Abstract
Purpose: Molecular classification of breast cancer has been proposed based on gene expression profiles of human tumors. Luminal, basal-like, normal-like, and erbB2+ subgroups were identified and were shown to have different prognoses. The goal of this research was to determine if these different molecular subtypes of breast cancer also respond differently to preoperative chemotherapy.
Experimental Design: Fine needle aspirations of 82 breast cancers were obtained before starting preoperative paclitaxel followed by 5-fluorouracil, doxorubicin, and cyclophosphamide chemotherapy. Gene expression profiling was done with Affymetrix U133A microarrays and the previously reported “breast intrinsic” gene set was used for hierarchical clustering and multidimensional scaling to assign molecular class.
Results: The basal-like and erbB2+ subgroups were associated with the highest rates of pathologic complete response (CR), 45% [95% confidence interval (95% CI), 24-68] and 45% (95% CI, 23-68), respectively, whereas the luminal tumors had a pathologic CR rate of 6% (95% CI, 1-21). No pathologic CR was observed among the normal-like cancers (95% CI, 0-31). Molecular class was not independent of conventional clinicopathologic predictors of response such as estrogen receptor status and nuclear grade. None of the 61 genes associated with pathologic CR in the basal-like group were associated with pathologic CR in the erbB2+ group, suggesting that the molecular mechanisms of chemotherapy sensitivity may vary between these two estrogen receptor–negative subtypes.
Conclusions: The basal-like and erbB2+ subtypes of breast cancer are more sensitive to paclitaxel- and doxorubicin-containing preoperative chemotherapy than the luminal and normal-like cancers.
Breast cancer is a clinically heterogeneous disease. Histologically similar tumors may have different prognoses and may respond to therapy differently. It is believed that these differences in clinical behavior are due to molecular differences between histologically similar tumors. DNA microarray technology is ideally suited to reveal such molecular differences. A novel molecular classification of breast cancer based on gene expression profiles was recently proposed (1). The investigators identified a set of stably expressed genes (“intrinsic gene set”; n = 534) that accounted for much of the molecular differences between 42 breast cancers and did hierarchical cluster analysis to identify subgroups of cancers with separate gene expression profiles. Luminal, basal-like, normal-like, and erbB2+ subgroups were identified and were shown to have different prognoses (1–4). These results were confirmed in follow-up experiments by the same group and others using larger numbers of cases. The basal-like (mostly estrogen receptor negative) and erbB2+ (mostly HER-2 amplified and estrogen receptor negative) subgroups had the shortest relapse-free and overall survival, whereas the luminal-type (estrogen receptor–positive) tumors had a more favorable clinical outcome (2–4). There is no published data on how the different molecular classes of breast cancer respond to chemotherapy. The goal of the current project was to examine if these different molecular subclasses of breast cancer also respond differently to anthracycline- and paclitaxel-containing preoperative chemotherapy.
Patients and Methods
Fine needle aspirations of breast cancer were collected in a prospectively designed pharmacogenomic marker discovery study at the Nellie B. Connally Breast Center of the University of Texas M.D. Anderson Cancer Center. The goal of the ongoing clinical study was to develop multigene predictors of pathologic complete response (CR) to preoperative therapy. The current analysis was undertaken to examine if molecular class is associated with sensitivity to chemotherapy. Gene expression results from the first 82 patients with stage I to III breast cancer were included in this analysis. Patient characteristics are presented in Table 1. Fine needle aspiration was done using a 23- or 25-gauge needle before starting preoperative chemotherapy with 12 weeks of paclitaxel followed by 5-fluorouracil, doxorubicin, and cyclophosphamide × 4 courses. Cells from 2 to 3 passes were collected into vials containing 1 mL of RNAlater solution (Ambion, Austin, TX) and stored at −80°C. Median RNA yield of the 82 specimens was 2.0 μg (range, 1-22 μg). Approximately 70% of all aspirations yielded at least 1 μg total RNA, which is required for gene expression profiling. The main reason for failure to obtain sufficient RNA was acellular aspirations (low cell yield). The cellular composition of the fine needle aspiration samples was previously reported; in brief, fine needle aspiration samples on average contain 80% neoplastic cells and the rest of the cells are infiltrating leukocytes (5). These samples contain few, if any, stromal cells (fibroblasts and adipocytes) and little or no normal breast epithelium. Of the 82 RNA specimens used in this analysis, 33 were included in a previous pharmacogenomic analysis using cDNA arrays (6). These 33 cases were profiled on both platforms (Affymetrix U133A and proprietary cDNA) and the results of the cross-platform comparison of gene expression data were published separately (7). All patients underwent surgery after completion of 24 weeks of preoperative chemotherapy.
Grossly visible residual cancer was measured and representative sections were submitted for routine histopathologic examination. When there was no grossly visible residual cancer, the slices of the specimen were radiographed and all areas of radiologically and/or architecturally abnormal tissue were entirely submitted for histopathologic study. Patients without any residual invasive cancer in the breast and axillary lymph nodes were considered to have pathologic CR. Patients with residual in situ cancer (DCIS) only were also considered to have pathologic CR. Estrogen receptor and HER-2 status were determined by routine clinical diagnostic methods [using mouse monoclonal anti–estrogen receptor antibody 6F11 (Novocastra/Vector Laboratories, Burlingame, CA) and fluorescence in situ hybridization assay to determine HER-2 amplification (PathVysion kit, Vysis, Downers Grove, IL)] on a diagnostic core needle biopsy obtained before or concomitant with the research fine needle aspiration. Nuclear grade was defined by the modified Black's nuclear grading system (1 = low grade, 2 = intermediate grade, and 3 = high grade; ref. 8). The study was approved by the Institutional Review Board of M.D. Anderson Cancer Center, and all patients signed an informed consent.
Female | 82 (100%) | |
Median age | 52 y (range 29-79) | |
Race | ||
Caucasian | 56 (68%) | |
African American | 11 (13%) | |
Asian | 7 (9%) | |
Hispanic | 6 (7%) | |
Mixed | 2 (2%) | |
Histology | ||
Invasive ductal | 73 (89%) | |
Mixed ductal/lobular | 6 (7%) | |
Invasive lobular | 1 (1%) | |
Invasive mucinous | 2 (2%) | |
Tumor-node-metastasis stage | ||
T1 | 7 (9%) | |
T2 | 46 (56%) | |
T3 | 15 (18%) | |
T4 | 14 (17%) | |
N0 | 28 (34%) | |
N1 | 38 (46%) | |
N2 | 8 (10%) | |
N3 | 8 (10%) | |
Nuclear grade (modified Black's) | ||
1 | 2 (2%) | |
2 | 23 (37%) | |
3 | 35 (61%) | |
Estrogen receptor positive* | 35 (43%) | |
Estrogen receptor negative | 47 (57%) | |
HER-2 positive† | 57 (70%) | |
HER-2 negative | 25 (30%) | |
Neoadjuvant therapy‡ | ||
Weekly T (80 mg/m2) × 12 + FAC × 4 | 69 (84%) | |
3-weekly T (225 mg/m2 CI) × 4 + FAC × 4 | 13 (16%) | |
Pathologic CR | 21 (26%) | |
Residual disease | 61 (74%) |
*Cases where >10% of tumor cells stained positive for estrogen receptor with immunohistochemistry were considered positive.
†Cases that showed gene copy number >2.0 were considered HER-2 positive.
‡T, paclitaxel; CI, 24-hour continuous infusion; and FAC, 5-fluorouracil (500 mg/m2), doxorubicin (50 mg/m2), and cyclophosphamide (500 mg/m2).
RNA was extracted from fine needle aspiration samples using the RNeasy Kit (Qiagen, Valencia, CA). The amount and quality of RNA were assessed with a DU-640 UV Spectrophotometer (Beckman Coulter, Fullerton, CA) and by an Agilent 2100 Bioanalyzer RNA 6000 LabChip kit (Agilent Technologies, Palo Alto, CA). Profiling was done without second round amplification using a minimum of 1 μg total RNA. Double-stranded cDNA was synthesized, followed by an in vitro transcription reaction to generate biotinylated cRNA. Biotin-labeled and fragmented cRNA was hybridized to Affymetrix U133A gene chips overnight at 42°C. The Affymetrix GeneChip system was used for hybridization and scanning and the dCHIP V1.3 (http://dchip.com) software was used to generate probe level signal and for normalization of data across arrays.
Data Analysis
dCHIP V1.3 software was used for normalization; this program normalizes all arrays to one standard array that represents a chip with median overall intensity. After normalization, estimates of feature level intensity were derived from the 75th percentile of pixel level intensity of each feature. Each individual probe was aggregated at the feature level to form a single measure of intensity for each probe set. We used the perfect match model. Statistical analysis was done by using the BRB-Arraytools version 3.0 software package (http://linus.nci.nih.gov/BRB-ArrayTools.html). Complete linkage hierarchical clustering was done with the previously published breast cancer intrinsic gene set with 1 − Pearson correlation coefficient as distance metric (1). Cluster reproducibility and the robustness of the dendrograms were examined by the method proposed by McShane et al. (9) based on 500 perturbations. Tumors clustering together in significant dendrogram branches were categorized as one molecular class. We also used multidimensional scaling with the Euclidean distance as metric to provide graphical representation of the distances among samples. This method also made it possible to test global statistical significance to determine whether the expression profiles form distinct clusters (rather than represent the same multivariate Gaussian distribution). Genes differentially expressed in a particular molecular class compared with all other tumors and between cases of pathologic CR and residual cancer within a single molecular subgroup were identified using the significance analysis of microarrays (SAM) software with 1,000 sample permutations. SAM uses permutations to estimate the false discovery rate and an adjustable threshold allows for control of the false discovery rate (10).
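The clustering step can be illustrated with a minimal sketch in plain Python. It is not the BRB-ArrayTools implementation: the function names and the four toy expression profiles are hypothetical, and the sketch simply stops merging at a requested number of clusters instead of assessing dendrogram branches for reproducibility.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)


def complete_linkage(profiles, n_clusters):
    """Merge samples bottom-up until n_clusters remain (complete linkage)."""
    clusters = [[i] for i in range(len(profiles))]
    dist = {
        (i, j): pearson_distance(profiles[i], profiles[j])
        for i, j in combinations(range(len(profiles)), 2)
    }

    def linkage(a, b):
        # Complete linkage: cluster distance is the largest pairwise distance.
        return max(dist[(min(i, j), max(i, j))] for i in a for j in b)

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy data: rows are samples, columns are genes.
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.2, 2.9, 2.1, 0.9],
]
groups = complete_linkage(toy_profiles, n_clusters=2)
print(groups)
```

Because the distance is correlation based, samples with the same expression pattern are grouped together regardless of overall intensity, which is one reason this metric is commonly chosen for expression data.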
Pathologic complete response rates were calculated for each molecular class and assessed in univariate analysis (χ2 test) and multivariate analysis (logistic regression). Estrogen receptor and HER-2 status, nuclear grade, tumor size, and lymph node involvement were included in the multivariate analysis. We built logistic regression–based prediction models with various combinations of clinical variables and molecular class to examine if knowledge of the molecular class improves prediction accuracy above what can be achieved by combining routine clinical variables.
Results
Hierarchical clustering with the breast cancer intrinsic gene set reveals previously described molecular classes in fine needle aspiration specimens. The intrinsic breast cancer gene set consists of 534 genes of which expression showed significantly larger variation between tumors than between paired samples from the same tumor in an early seminal publication (1). Of these intrinsic genes, 424 were represented on the Affymetrix U133A chip. We did supervised hierarchical clustering with 689 Affymetrix probe sets that represented these 424 genes to define the molecular classes of breast tumors in our data. The tumors clustered into four major classes. The reproducibility indices of the four distinct clusters were 0.82, 0.76, 0.85, and 0.78, respectively, which indicates reasonably robust clusters (9). Tumors within each molecular subtype corresponded well to the previously described clinicopathologic phenotypes of luminal (n = 30), normal-like (n = 10), basal-like (n = 22), and erbB2+ (n = 20) cancers (Fig. 1A). All of the luminal tumors were estrogen receptor positive by immunohistochemistry. All but two cases (80%) of the erbB2+ molecular class had HER-2 gene amplification by fluorescence in situ hybridization analysis. All but one of the basal-like tumors (95%) were estrogen receptor negative and 75% of these tumors were also high nuclear grade. These groups did not differ significantly in nodal status, tumor size, or patient age distribution (Fig. 1B). Multidimensional scaling analysis also confirmed the presence of significant clustering of the cases (global test of significance P = 0.04, Fig. 1C). To examine how sensitive the cluster results are to the actual gene set used for clustering, we did a multidimensional scaling analysis using the probe sets with the highest variance (top 10%) across all samples (2,228 probe sets including 229 overlapping probes with the intrinsic gene set). Cases with the same molecular class (as defined by the intrinsic gene set) continued to cluster together (global test of significance P = 0.047; Fig. 1E). This suggests that the gene signature–based groups are robust.
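Selecting the most variable probe sets, as in the sensitivity analysis above, amounts to ranking probe sets by their variance across samples and keeping the top fraction. The sketch below is illustrative only: the function name, the dictionary layout, and the toy values are assumptions, not the authors' code.

```python
from statistics import variance


def top_variance_probes(expression, fraction=0.10):
    """Return the probe set IDs with the highest across-sample variance.

    expression: dict mapping probe set ID -> list of values, one per sample.
    fraction:   share of probe sets to keep (0.10 keeps the top 10%).
    """
    n_keep = int(len(expression) * fraction)
    ranked = sorted(expression, key=lambda pid: variance(expression[pid]), reverse=True)
    return ranked[:n_keep]


# Hypothetical toy matrix: 10 probe sets measured in 4 samples.
# The spread of values (and therefore the variance) grows with the index.
toy_expression = {f"probe_{i}": [0.0, float(i), 0.0, float(i)] for i in range(10)}
selected = top_variance_probes(toy_expression, fraction=0.2)
print(selected)
```

A variance filter of this kind uses no class labels or outcome data, so it provides a check on the clustering that does not depend on the intrinsic gene set.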
To define the molecular differences further between the subgroups, we identified differentially expressed genes between the four molecular classes using SAM analysis on the most variably expressed probe sets (n = 2,228). Setting the most stringent false discovery rate at 0.0001, 372 probe sets representing 298 genes were identified as differentially expressed between the four distinct groups (Supplementary Table S1). The high expression of estrogen receptor 1 and several of the known estrogen receptor–inducible genes, such as X-box binding protein 1 and SLC39A6 among many others, characterized the luminal subgroup. The basal-like subgroup was characterized by the expression of keratin 17, keratin 5, and γ-aminobutyric acid receptor π subunit among others. The erbB2+ subtype was characterized by the overexpression of genes that are located in the HER-2 amplicon including erbB2 and GRB7. Interestingly, the normal-like group had only 15 genes that were overexpressed in this subgroup. These gene lists could be used to further characterize the various molecular subclasses and for the development of supervised molecular class prediction methods.
Correlation between molecular class and pathologic complete response to preoperative chemotherapy. The rates of pathologic CR differed significantly among the four molecular classes of breast cancer defined by clustering using the intrinsic gene set. Basal-like and erbB2+ subgroups were associated with the highest rate of pathologic CR, 45% [95% confidence interval (95% CI), 24-68] and 45% (95% CI, 23-68), respectively, whereas luminal tumors had a pathologic CR rate of 6% (95% CI, 1-21). No pathologic complete response was observed in the normal-like subclass (Table 2). We next used a multidimensional scaling graph to explore if the breast intrinsic gene set can separate cases with pathologic CR versus those with residual disease (Fig. 1D). The global test of significance showed that the observed clusters of pathologic CR and residual disease were significantly separate (P = 0.026).
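The univariate comparison of response rates across the four classes can be reproduced approximately from the counts reported in Table 2. The following is a minimal sketch of a Pearson χ2 test on a 4 × 2 table; the helper names are hypothetical, and the closed-form tail probability is valid only for 3 degrees of freedom. Note that the expected count of responders in the normal-like class is below 5, so the χ2 approximation is rough for these data.

```python
from math import erfc, exp, pi, sqrt

# Counts from Table 2: (no pathologic CR, pathologic CR) per molecular class.
observed = {
    "luminal": (28, 2),
    "normal-like": (10, 0),
    "erbB2+": (11, 9),
    "basal-like": (12, 10),
}


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


stat = chi_square_statistic(observed)
p_value = chi2_sf_df3(stat)  # df = (4 - 1) * (2 - 1) = 3
print(f"chi-square = {stat:.2f}, P = {p_value:.5f}")
```

Under these assumptions the tail probability falls below 0.001, in line with the P value shown in Table 2.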
Molecular classification | No pathologic CR, n [% (95% CI)] | Pathologic CR, n [% (95% CI)] | P
---|---|---|---
Luminal A/B subtype | 28 [93% (78-99)] | 2 [7% (1-22)] |
Normal breast like | 10 [100% (69-100)] | 0 [0% (0-31)] |
erbB2+ | 11 [55% (32-77)] | 9 [45% (23-68)] |
Basal subtype | 12 [55% (32-76)] | 10 [45% (24-68)] | <0.001
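The confidence intervals in this table are wide because each class contains only 10 to 30 patients. The source does not state how the intervals were computed; the sketch below assumes exact binomial (Clopper-Pearson) intervals and solves for the limits by bisection using only the standard library. Function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, iters=60):
    """Find p in [0, 1] with f(p) == target, for f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k responders out of n."""
    # Lower limit: P(X >= k) equals alpha / 2 (defined as 0 when k == 0).
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit: P(X <= k) equals alpha / 2 (defined as 1 when k == n).
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Responders / class size as reported in the table.
for name, k, n in [("luminal", 2, 30), ("normal-like", 0, 10),
                   ("erbB2+", 9, 20), ("basal-like", 10, 22)]:
    lo, hi = clopper_pearson(k, n)
    print(f"{name}: {k}/{n}, 95% CI {100 * lo:.0f}-{100 * hi:.0f}%")
```

For the normal-like class (0 of 10) the upper limit has a closed form, 1 − 0.025^(1/10) ≈ 0.31, which agrees with the 0-31 interval reported for that class.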
Next, we examined the clinical pathologic variables that were associated with pathologic CR in this data. Age < 50 years and estrogen receptor–negative status were identified as independent variables associated with higher likelihood of pathologic CR in multivariate analysis including age, estrogen receptor and HER-2 status, tumor size, clinical nodal status, and nuclear grade (Table 3). To examine if knowledge of the molecular class improves estimation of probability of pathologic CR beyond what can be achieved with routine clinical variables, we built three different logistic regression models including the clinical variables (age, tumor, node stage), the histopathologic variables (grade, estrogen receptor, and HER-2 status), and the molecular class in various combinations. For this analysis, we merged the luminal and the normal-like groups because there was no pathologic CR in the normal-like category and these tumors were phenotypically similar to the luminal tumors (HER-2 normal and estrogen receptor positive) that also had low pathologic CR rates. Molecular class was not independently associated with pathologic CR because of the high correlation between molecular class and estrogen receptor status and nuclear grade in this cohort. We constructed Receiver Operating Characteristic curves to measure the predictive accuracy of the logistic regression models including (a) clinical + pathologic variables, (b) clinical variables + molecular classification, and (c) clinical + pathologic variables + molecular class (Fig. 2). The three models yielded similar areas under the Receiver Operating Characteristic curve. This indicates that the molecular class alone can replace histopathologic characteristics (estrogen receptor, HER-2 status, or grade) for prediction of pathologic CR but provides little additional information when these characteristics are included. 
More directed supervised class prediction methods may be needed to develop a multigene predictor of pathologic CR. Such predictors can be developed by identifying informative genes that are differentially expressed between cases of pathologic CR and residual disease and combining these genes into a weighted prediction score or algorithm.
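The area under the Receiver Operating Characteristic curve used to compare the models has a simple interpretation: it is the probability that a randomly chosen case with pathologic CR receives a higher predicted probability than a randomly chosen case with residual disease. A minimal sketch of that calculation follows; the function name and the predicted probabilities are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half).

    scores: predicted probabilities of pathologic CR.
    labels: 1 for pathologic CR, 0 for residual disease.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model output for six patients.
toy_scores = [0.90, 0.80, 0.40, 0.35, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(toy_scores, toy_labels)
print(f"AUC = {auc:.3f}")
```

An area of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so models with similar areas, as observed here, discriminate responders about equally well.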
Variables | Model 1: clinical and histologic variables | | Model 2: clinical variables and molecular classification | | Model 3: clinical, histologic variables, molecular classification | |
---|---|---|---|---|---|---
 | OR (95% CI) | P | OR (95% CI) | P | OR (95% CI) | P
Age (y) | | | | | |
<50 | 1 | | 1 | | 1 |
>50 | 0.27 (0.8-0.91) | 0.035 | 0.17 (0.06-0.45) | <0.001 | 0.43 (0.11-1.7) | 0.43
Tumor (cm) | | | | | |
<5 | 1 | | 1 | | 1 |
>5 | 0.55 (0.14-2.3) | 0.41 | 0.53 (0.15-1.8) | 0.28 | 0.64 (0.14-2.9) | 0.56
Node | | | | | |
N0 | 1 | | 1 | | 1 |
N1-3 | 0.96 (0.31-3.0) | 0.94 | 0.65 (0.22-2.0) | 0.18 | 0.90 (0.22-3.7) | 0.90
Estrogen receptor | | | | | |
Negative | 1 | | | | 1 |
Positive | 0.12 (0.02-0.31) | <0.001 | | | 0.08 (0.02-0.35) | 0.001
HER-2 | | | | | |
Negative | 1 | | | | 1 |
Positive | 1.77 (0.42-7.5) | 0.43 | | | 0.32 (0.03-3.6) | 0.34
Nuclear grade | | | | | |
1/2 | 1 | | | | 1 |
3 | 2.6 (0.81-8.4) | 0.11 | | | 2.5 (0.4-13.6) | 0.30
Histology | | | | | |
Ductal | 1 | | | | 1 |
Other | 1.14 (0.17-7.5) | 0.89 | | | 2.3 (0.11-1.7) | 0.76
Molecular classification | | | | | |
Luminal/normal-like | | | 1 | | 1 |
Normal-like | | | 0 (0-…) | 0.99 | 0 (0-…) | 0.99
Basal-like | | | 3.3 (1.0-11) | 0.06 | 0.8 (0.12-5.5) | 0.83
erbB2+ | | | 4.4 (1.2-17) | 0.026 | 7.8 (0.62-100) | 0.11
NOTE: Multivariate analysis of different combinations of clinical (age, tumor, node) and histopathologic characteristics (grade, estrogen receptor and HER-2 status, and histologic type) and molecular class as variables. Three distinct prediction models were examined: clinical plus histopathologic variables (model 1), clinical variables plus molecular class (model 2), and all three types of information together (model 3).
Genes associated with pathologic complete response in the different molecular subgroups. We next examined if the genes of which expression was associated with pathologic CR differed between basal-like and erbB2+ subtypes. Because only two cases of pathologic CR were observed in the luminal group and no pathologic CR was seen in the normal-like group, these groups were not included in this exploratory analysis. Seventy-two probe sets (corresponding to 61 genes) were differentially expressed between basal-like tumors that achieved a pathologic CR and those that did not (Table 4). All highly variable genes (n = 2,228) were used in this analysis and the false discovery rate was set at 5%. Interestingly, within the erbB2+ group, zero genes were identified at false discovery rate < 10%. If the false discovery rate was set at 50%, 16 probe sets (15 genes) were identified; however, half of these could represent spurious discovery (data not shown). A greater variance of gene expression among erbB2+ tumors and, therefore, greater molecular heterogeneity compared with basal-like tumors combined with the small sample size (erbB2+; n = 20) may explain why it was difficult to identify genes in this group. Importantly, none of the genes associated with pathologic CR in the basal-like group was associated with pathologic CR in the erbB2+ group. We also assessed if there was a correlation between fold differences of expression of pathologic CR-associated genes in cases with pathologic CR compared with residual disease in basal-like and erbB2+ tumors, respectively. There was no correlation (P = 0.19). The absence of correlation suggests that genes associated with chemotherapy sensitivity are different between these two molecular subgroups of breast cancer.
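The permutation idea behind this analysis can be illustrated for a single gene. The sketch below is a deliberate simplification and is not SAM: SAM uses a moderated statistic and estimates the false discovery rate across all genes at once, whereas this example only computes an exact two-sided permutation P value for a difference in mean expression between two groups. The function name and expression values are hypothetical.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation P value for a difference in group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    extreme = relabelings = 0
    # Enumerate every way of assigning n_a of the pooled samples to group A.
    for idx in combinations(range(len(pooled)), n_a):
        relabelings += 1
        # Small tolerance guards against floating-point rounding in ties.
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / relabelings


# Hypothetical log-expression values of one gene in two outcome groups.
residual_disease = [5.1, 5.3, 5.0, 5.4, 5.2]
pathologic_cr = [3.9, 4.1, 4.0, 4.2, 3.8]
p_value = exact_permutation_p(residual_disease, pathologic_cr)
print(f"P = {p_value:.4f}")
```

With small groups the number of distinct relabelings limits how small a P value can be, which is one reason few genes may reach significance in a subgroup of about 20 tumors.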
Rank | Probe set | Gene symbol | Fold difference of means between residual disease and pathologic CR |
---|---|---|---|
1 | 213060_s_at* | CHI3L2 | 4.117 |
2 | 200052_s_at | ILF2 | 2.645 |
3 | 213338_at | RIS1 | 3.181 |
4 | 214433_s_at* | SELENBP1 | 2.735 |
5 | 204319_s_at | RGS10 | 2.183 |
6 | 213005_s_at | ANKRD15 | 2.517 |
7 | 221561_at | SOAT1 | 2.156 |
8 | 212190_at | SERPINE2 | 2.448 |
9 | 209387_s_at | TM4SF1 | 3.096 |
10 | 221727_at | PC4 | 1.842 |
11 | 213716_s_at | SECTM1 | 2.183 |
12 | 203165_s_at | SLC33A1 | 1.826 |
13 | 207414_s_at | PACE4 | 1.732 |
14 | 201819_at | SCARB1 | 2.118 |
15 | 217983_s_at | RNASET2 | 1.915 |
16 | 214540_at | HIST1H2BO | 1.864 |
17 | 218538_s_at | MRS2L | 1.794 |
18 | 202506_at | SSFA2 | 1.678 |
19 | 215071_s_at | HIST1H2AC | 2.099 |
20 | 202988_s_at | RGS1 | 2.328 |
21 | 220624_s_at | ELF5 | 2.521 |
22 | 221505_at* | ANP32E | 2.51 |
23 | 208370_s_at | DSCR1 | 2.332 |
24 | 204407_at | TTF2 | 1.913 |
25 | 218398_at | MRPS30 | 1.595 |
26 | 213754_s_at | TRIM26 | 1.844 |
27 | 210147_at | ART3 | 3.943 |
28 | 204809_at | CLPX | 1.813 |
29 | 202035_s_at | SFRP1 | 6.461 |
30 | 209389_x_at | DBI | 1.792 |
31 | 201897_s_at | CKS1B | 1.937 |
32 | 209142_s_at | UBE2G1 | 1.746 |
33 | 209340_at | UAP1 | 1.724 |
34 | 203362_s_at | MAD2L1 | 1.967 |
35 | 217028_at | CXCR4 | 2.097 |
36 | 205044_at* | GABRP | 6.224 |
37 | 36711_at | MAFF | 2.271 |
38 | 202023_at | EFNA1 | 1.721 |
39 | 212915_at | PDZRN3 | 2.539 |
40 | 217851_s_at | C20orf45 | 1.734 |
41 | 211762_s_at | KPNA2 | 1.849 |
42 | 213134_x_at* | BTG3 | 2.007 |
43 | 204162_at | KNTC2 | 2.283 |
44 | 212276_at | LPIN1 | 2.219 |
45 | 219768_at | B7-H4 | 1.843 |
46 | 209551_at | MGC11061 | 1.925 |
47 | 203744_at | HMGB3 | 1.457 |
48 | 200975_at | PPT1 | 1.628 |
49 | 221931_s_at | SEC13L | 1.796 |
50 | 209786_at | HMGN4 | 1.63 |
51 | 218963_s_at | KRT23 | 3.088 |
52 | 219209_at | MDA5 | 2.337 |
53 | 214214_s_at | C1QBP | 1.799 |
54 | 209656_s_at | TM4SF10 | 2.645 |
55 | 203706_s_at* | FZD7 | 2.603 |
56 | 206055_s_at | SNRPA1 | 1.799 |
57 | 204825_at | MELK | 1.735 |
58 | 212762_s_at | TCF7L2 | 1.928 |
59 | 203423_at | RBP1 | 1.82 |
60 | 210605_s_at* | MFGE8 | 2.085 |
61 | 214835_s_at | SUCLG2 | 1.576 |
NOTE: There are 72 probe sets significant by SAM, corresponding to 61 genes. The median false discovery rate among the 72 significant probe sets is 0.04. Genes are ranked by significance.
Discussion
The goal of this current project was to further evaluate the clinical relevance of a novel gene expression–based classification system of breast cancer. This new classification is based on gene expression signatures of variably expressed genes in breast cancer (1). It has previously been shown that the various molecular classes have different long-term survivals (1–4). However, it is not possible to decipher from these earlier studies if the differences in survival are due to different metastatic potentials or to different sensitivities to adjuvant chemotherapy or hormonal therapy because the patients included in these studies received various forms of multimodality treatment. In the current study, we examined newly diagnosed stage I to III breast cancers that all received preoperative treatment with anthracycline and taxane followed by surgery to determine if the different molecular classes show different chemotherapy sensitivities based on pathologic response to preoperative chemotherapy.
All previous reports on molecular classification used frozen breast cancer tissues for gene expression profiling. The current study differs from these in that we used fine needle aspiration specimens. Surgically resected cancer tissues differ from fine needle aspiration in cellular composition. The fine needle aspiration material contains 80% to 90% pure neoplastic cells whereas surgical biopsies or core needle biopsies contain a variable amount of stromal cells. It was therefore of interest to determine if the intrinsic gene set that discriminated molecular class in surgical specimens could also separate molecular classes of breast cancer in fine needle aspiration data. If such separation can be observed, this would suggest that these informative genes are primarily expressed in neoplastic cells rather than in stromal cells.
In the current study, we did hierarchical clustering and multidimensional scaling analysis using the breast cancer intrinsic gene set, which mimics the original class discovery process, because there are presently no uniformly accepted class prediction tools to define the molecular classes of breast cancer using gene expression data. The results in our fine needle aspiration data were very similar to those reported by others on surgical tissues. The two most readily distinguishable molecular classes of breast cancer are the basal-like and luminal subtypes, whereas the normal-breast like class is the least robust. This may be because the original samples in this category contained a significant amount of contaminating normal breast tissue. The basal-like, erbB2+, and luminal subclasses were distinguished by some of the same genes and histologic phenotypes in our series as previously reported. This supports the hypothesis that these clusters represent genuinely different diseases within breast cancer (3, 4).
The different molecular classes of breast cancer showed different sensitivities to preoperative chemotherapy. The basal-like and erbB2+ subgroups had the highest rates of pathologic CR: 45% (95% CI, 24-68) and 45% (95% CI, 23-68), respectively. The luminal and normal-like tumors had low pathologic CR rates of 6% (95% CI, 1-21) and 0% (95% CI, 0-31), respectively. However, molecular class was not independent of conventional pathologic predictors of response (i.e., grade and estrogen receptor status). The basal-like and erbB2+ tumors were predominantly high nuclear grade and the basal-like tumors were almost all estrogen receptor negative. Both of these characteristics are known to be associated with higher likelihood of pathologic CR to preoperative chemotherapy (11–13). Because of this association, incorporation of molecular class into a logistic regression–based predictor of response did not improve the prediction accuracy compared with using routine clinical and pathologic variables only. Therefore, it is likely that more focused gene signature–based predictors will need to be developed through supervised outcome prediction methods.
How best to define a multigene predictor of response to chemotherapy is not known. One approach is to group all breast cancers into either responders (e.g., pathologic CR) or nonresponders (e.g., residual disease), define the gene expression differences between these groups, and use this information to construct a response prediction score or machine learning–based predictor. This approach was successfully applied to develop prognostic signatures for breast cancer and was also promising in small pilot studies of chemotherapy response prediction (6, 14, 15). Developing the best possible supervised classifier for prediction of pathologic CR from this data set was not the goal of the current analysis. However, if distinct molecular classes of breast cancer exist, one could hypothesize that stratification of patients by molecular class may yield more accurate class-specific predictors than unstratified use of the data.
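The general approach described above (compare responders with nonresponders gene by gene, then combine the selected genes into a score) can be sketched as follows. This is an illustrative toy under stated assumptions, not the method used in this study or in the cited pilot studies: the use of Welch's t statistic for ranking, the signed-sum score, the function names (`welch_t`, `select_genes`, `response_score`), and all expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """Welch's t statistic for two independent samples.

    Assumes each group has at least two values and nonzero variance.
    """
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))


def select_genes(expr, responder, top_k):
    """Rank genes by |t| between responders and nonresponders; keep top_k."""
    stats = {}
    for gene, values in expr.items():
        resp = [v for v, is_r in zip(values, responder) if is_r]
        nonresp = [v for v, is_r in zip(values, responder) if not is_r]
        stats[gene] = welch_t(resp, nonresp)
    ranked = sorted(stats, key=lambda g: abs(stats[g]), reverse=True)
    return {g: stats[g] for g in ranked[:top_k]}


def response_score(sample, signature):
    """Signed sum: genes higher in responders add, genes lower subtract."""
    return sum((1 if t > 0 else -1) * sample[g] for g, t in signature.items())


# Made-up expression values for six tumors (three responders, three not).
expr = {
    "gene_a": [5.1, 4.8, 5.3, 2.0, 2.2, 1.9],
    "gene_b": [1.0, 1.2, 0.9, 3.1, 2.9, 3.3],
    "gene_c": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
}
responder = [True, True, True, False, False, False]

signature = select_genes(expr, responder, top_k=2)
new_sample = {"gene_a": 4.9, "gene_b": 1.1, "gene_c": 2.0}
score = response_score(new_sample, signature)
print(sorted(signature), round(score, 2))
```

Under the stratification hypothesis, the same selection step would be run separately within each molecular class rather than on all tumors pooled. In any real application the gene selection would also need to be validated on independent cases, because selecting and scoring on the same small data set overstates accuracy.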
As an exploratory analysis, we attempted to define the molecular differences between tumors that are extremely chemotherapy sensitive (pathologic CR) and those that are more resistant (residual disease) within the basal-like and erbB2+ groups, separately. In the basal-like group (n = 22, including 10 pathologic CR), 61 genes were identified that were statistically significantly associated with pathologic CR. It is important to realize that none of these genes are associated with estrogen receptor status or high grade (the two conventional strong predictors of pathologic CR) because the basal-like group almost exclusively consists of high-grade and estrogen receptor–negative tumors. We could not define a robust gene set that correlated with pathologic CR in the erbB2+ group (n = 20, including 9 pathologic CR). Importantly, the genes that were associated with pathologic CR in the basal-like group were not associated with pathologic CR in the erbB2+ group. This suggests that distinct sets of genes are associated with pathologic CR in the different molecular classes.
It is tempting to speculate on the biological function of the genes that are differentially expressed between cases with pathologic CR and those with residual cancer. However, not all of these genes may play a causative role in determining sensitivity to chemotherapy. Some of these may be distant downstream transcriptional effects of biological events that influence drug sensitivity and a few could represent spurious discovery. From the vantage point of gaining mechanistic insight into the biology of chemotherapy sensitivity or resistance, these gene lists should be regarded as hypothesis-generating and will require further in vitro experimentation to show a functional role for any particular molecule.
In summary, these results indicate that the major molecular classes of breast cancer can be detected in gene expression data regardless of tissue sampling method (i.e., fine needle aspiration, core needle biopsy, or surgical biopsy). The different molecular classes of breast cancer not only have different prognoses but also show distinct sensitivities to preoperative chemotherapy. The basal-like and erbB2+ subtypes of breast cancer are more sensitive to paclitaxel- and doxorubicin-containing preoperative chemotherapy than the luminal and normal-like cancers. The genes associated with pathologic CR were different between the basal-like and erbB2+ subgroups, which suggests that the mechanisms of chemotherapy sensitivity may vary across the subtypes. The possibility that distinct predictive signatures can be developed for the different molecular subtypes of breast cancer warrants further examination.
Grant support: The Nellie B. Connally Breast Cancer Research Fund, Millennium Pharmaceuticals, The Dee Simmons Fund, University of Texas M.D. Anderson Cancer Center Aventis Drug Development Award (L. Pusztai), The Susan G. Komen Breast Cancer Foundation (grant LF2002-044HM; W.F. Symmans), Association pour la Recherche sur le Cancer (R. Rouzier), and National Cancer Institute Specialized Program of Research Excellence in Breast Cancer (grant P50-CA58223-09A1; C.M. Perou).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).