Purpose: We sought to systematically define determinants of the response to neoadjuvant chemotherapy to elucidate predictive biomarkers for breast cancer.

Experimental Design: An unbiased systematic analysis was performed in multiple independent datasets to define genes predictive of complete pathologic response (pCR) following treatment with neoadjuvant chemotherapy. These genes were interrogated across estrogen receptor (ER)–positive and ER-negative breast cancer and those in common across three different treatment regimens were analyzed for optimal predictive power. Subsequent validation was performed on independent cohorts by gene expression and IHC analyses.

Results: Genes that were highly associated with the response to neoadjuvant chemotherapy in breast cancer were readily defined using a computational method ranking individual genes by their respective ROC characteristics. Genes predictive of the response to taxane-associated therapies were strongly enriched for cell-cycle control processes in both ER-positive and ER-negative breast cancer and correlated with pCR. However, other genes that were specifically associated with residual disease were also identified under other treatment conditions. Using the intersection between treatment groups, nine genes were identified that harbored strong predictive power in multiple contexts and in validation cohorts. In particular, the nuclear oncogene DEK was strongly associated with pCR, whereas the cell surface protein BCAM was strongly associated with residual disease. By IHC staining, these markers exhibited potent predictive power that remained significant in multivariate analysis.

Conclusion: Systematic computational approaches can define key genes that will be able to predict the response to chemotherapy across multiple treatment modalities yielding a small collection of biomarkers that can be readily deployed by IHC analyses. Clin Cancer Res; 20(18); 4837–48. ©2014 AACR.

Translational Relevance

Currently, there are no clinically used markers to define patients that will benefit from neoadjuvant chemotherapy. Here, an unbiased systematic approach was used to define pathways and specific markers associated with the response to neoadjuvant chemotherapy in breast cancer. These analyses revealed that genes involved in cell-cycle control processes that are regulated by the RB/E2F pathway are significantly associated with response to chemotherapy in both ER-positive and ER-negative breast cancer. However, additional genes were identified that were predictive of response, particularly across different therapeutic regimens. Importantly, identified genes associated with pathologic complete response or residual disease were evaluated in independent cohorts by gene expression and IHC, demonstrating strong predictive power. Together, these data suggest that a relatively small number of biomarkers can be identified to predict response to neoadjuvant chemotherapy.

Although breast cancer is treated with a variety of targeted agents, conventional cytotoxic chemotherapy remains a mainstay of therapy (1–4). At present, complex chemotherapy regimens are applied in multiple distinct clinical scenarios in the treatment of breast cancer. It is well appreciated that triple-negative breast cancer is treated largely exclusively with chemotherapy (2, 5, 6); however, other forms of breast cancer are also treated with chemotherapy. For example, luminal B breast cancer is often treated with adjuvant chemotherapy in conjunction with estrogen receptor (ER)–targeted therapeutics (7–10). Similarly, Her2-positive cancers are treated with trastuzumab in conjunction with taxane-based chemotherapy (11). In all of these contexts, it is critically important to elucidate determinants of the response to chemotherapy.

One means to evaluate the response to chemotherapy in clinical specimens involves the analysis of the response to neoadjuvant chemotherapy (2, 12, 13). Although historically surgery has preceded treatment with adjuvant therapy, there has been a significant increase in neoadjuvant therapy (14, 15). Studies have shown that the response to neoadjuvant therapy is effective at predicting the ultimate course of tumor behavior and specific determinants of that response are being sought (2, 12, 16, 17). Importantly, pathologic response in neoadjuvant studies reveals tumor response to a given therapy independent of other prognostic features of disease, and therefore markers defined in the analyses of neoadjuvant treatment could be inferred to portend activity in the adjuvant setting as well.

Several studies have analyzed the gene expression programs associated with response to neoadjuvant chemotherapy (16–18). Our group and others have analyzed specific gene expression programs associated with response to chemotherapy. These studies have indicated that gene expression programs involved in RB/E2F biology or proliferation-associated properties are associated with pathologic complete response (19, 20). In contrast, others have used datasets to infer predictive markers using supervised computational approaches (16, 17, 21, 22). Here, we sought to use a simple method to identify individually predictive genes that can be used singly or in combination, across chemotherapy regimens and disease subtypes, to direct therapy. The small number of genes returned by such a method can be individually analyzed by IHC or other methods that are readily amenable to clinical use.

Datasets

Raw CEL files and platform annotations for gene expression datasets GSE20194, GSE20271, GSE22093, GSE23988, GSE25066, GSE41998, and GSE2226 were downloaded from the Gene Expression Omnibus (GEO). A comprehensive summary of the cohorts and related citations is provided in the Supplementary data (Supplementary Table S4). Datasets were normalized by the robust multiarray average algorithm from limma Bioconductor packages in R. We assumed that the probe annotations supplied by the array manufacturer were accurate. For genes with multiple probe sets, we averaged the gene expression levels across probe sets. Patients without response information were excluded from analysis. We pooled the neoadjuvant patients from the GEO datasets GSE20194, GSE20271, GSE22093, GSE23988, and GSE25066 to develop the “NEO” dataset. The datasets GSE41998 and GSE2226 were used for independent validation (Supplementary Table S4).
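The probe-set averaging step described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function name, the dictionary layout, and the choice to drop unannotated probes are assumptions made only for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_probes_by_gene(probe_expr, probe_to_gene):
    """Collapse probe-set expression to one value per gene and sample.

    probe_expr:    dict probe_id -> list of expression values (one per sample)
    probe_to_gene: dict probe_id -> gene symbol (hypothetical annotation map)
    """
    grouped = defaultdict(list)
    for probe, values in probe_expr.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # assumption: probes without annotation are dropped
            grouped[gene].append(values)
    # zip(*rows) iterates over samples; mean() averages the probe sets per sample
    return {
        gene: [mean(column) for column in zip(*rows)]
        for gene, rows in grouped.items()
    }
```

A gene measured by a single probe set passes through unchanged, so the function can be applied uniformly to the whole array.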

ROC screening

The area under the ROC curve (AUC) was used to screen genes according to their ability to distinguish between two phenotypes; the AUC was calculated with the R package ROCR. We ran ROCR for all genes on the microarray within each treatment type (TA, TFAC, and FAC), each subtype (ER-positive and ER-negative), and the pooled NEO dataset. Within each category, genes were ranked by how well they separated patients who had a complete response (pCR) from those who retained residual disease (RD) after chemotherapy. The ranking of the genes was performed for genes predictive of prognosis.
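The single-gene screen can be sketched as follows. This is a Python illustration, not the ROCR workflow used in the study: the AUC is computed through its rank interpretation (the probability that a randomly chosen pCR case has higher expression than a randomly chosen RD case), and the decision to rank by `max(AUC, 1 - AUC)` so that RD-associated genes are also surfaced is an assumption, as are all function names.

```python
def auc(scores, labels):
    """AUC as P(score of a pCR case > score of an RD case); ties count 0.5.

    labels: 1 for pCR, 0 for residual disease (RD).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pCR and one RD case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_genes(gene_expr, labels):
    """Rank genes by predictive strength, strongest first.

    Returns a list of (gene, strength, direction) tuples, where strength is
    max(AUC, 1 - AUC) and direction names the phenotype with higher expression.
    """
    ranked = []
    for gene, values in gene_expr.items():
        a = auc(values, labels)
        direction = "pCR" if a >= 0.5 else "RD"
        ranked.append((gene, max(a, 1.0 - a), direction))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

The pairwise comparison is quadratic in the number of cases, which is acceptable for a sketch; a rank-based implementation would be preferable for array-scale screens.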

Gene module summarization

To study associations between gene signature modules and clinical responses, we summarized the signatures to a single feature for classification. This approach was applied to the RB signature, Genome Grade Index, Mammaprint, OncotypeDx, and CIN70 gene signatures. We averaged the expression values of the genes in each signature, and the resulting average was used as a single feature for ROC screening. For the analysis of the Theraprint genes, single genes were analyzed for their association with clinical outcome by evaluating their ROC characteristics individually.
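As a rough illustration of the summarization step, the sketch below averages the genes of one signature into a per-sample score. The function name and the handling of signature genes missing from the array (silently skipped) are assumptions, not details taken from the study.

```python
from statistics import mean


def module_score(gene_expr, signature):
    """Average the expression of signature genes into one score per sample.

    gene_expr: dict gene symbol -> list of expression values (one per sample)
    signature: iterable of gene symbols defining the module
    """
    present = [g for g in signature if g in gene_expr]  # assumption: skip absent genes
    if not present:
        raise ValueError("no signature genes found in the expression data")
    # Each column holds one sample's values across the signature genes
    return [mean(column) for column in zip(*(gene_expr[g] for g in present))]
```

The returned list has one value per case and can be screened for its ROC characteristics in the same way as a single gene.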

UTSW cohort and IHC staining

A cohort of 74 consecutive cases for which tissue was available, and in which patients were treated with neoadjuvant chemotherapy between 2010 and 2012, was used for IHC validation analysis. Clinicopathologic features of the cohort are summarized in Supplementary Table S3. IHC stains for DEK (Santa Cruz Biotechnology, cat #: sc-30213, dil 1:200) and BCAM (Santa Cruz Biotechnology, cat #: sc-46795, dil 1:100) were performed on pretreatment biopsies. The stains were performed using a BenchMark XT stainer (Ventana Medical Systems).

Cell-cycle regulatory genes are potent predictors of response to neoadjuvant chemotherapy

To define the key determinants of chemotherapy response, data from several treatment groups were utilized. Because most patients with Her2-positive disease are also treated with trastuzumab (23), such patients were removed from the analysis; thus, these cohorts represent ER-positive, Her2-negative breast cancer and triple-negative breast cancer. Gene expression data of pretreatment biopsies from patients treated with taxane and anthracycline (TA) and taxane, 5-fluorouracil, anthracycline, and cytoxan (TFAC) were evaluated (Cohort information, Supplementary Table S1). Recognizing that each individual gene could have some intrinsic predictive value, a simple method of ranking each gene based on its receiver operating characteristic (ROC) curve was used in individual datasets (Fig. 1). This analysis provided a rank ordering of individual genes associated with response (Fig. 1A). The top 20 predictive AUC values were comparable between the two datasets (0.73–0.79 vs. 0.75–0.81). Interestingly, although the TFAC and TA datasets are independent, the same gene, the nuclear oncogene DEK, was found to have the top predictive power (Fig. 1A). Gene ontology analysis of the top 150 genes by AUC ranking demonstrated that in both TA and TFAC, there was a significant overrepresentation of genes associated with cell cycle (Fig. 1B). Notably, many of these genes are regulated by the RB/E2F pathway that has been independently associated with response to neoadjuvant chemotherapy (19, 24). Consistent with these findings, analysis of multiple signatures associated with cell-cycle genes including the GGI signature (25), the CIN70 signature (26), the OncotypeDx proliferation genes (27), Mammaprint (70 gene signature; ref. 28), and the RB signature (29) exhibited potent predictive values in the TA dataset and TFAC dataset (Fig. 1C and Supplementary Figs. S1 and S2). Unsupervised analyses demonstrated a clear association of a high-signature value with pCR (Fig. 1C). In addition, each signature had potent ROC characteristics in the datasets (Fig. 1D). Thus, these findings reinforce the concept that cell-cycle–regulated genes can have a profound influence on the response to neoadjuvant chemotherapy in breast cancer.

Figure 1.

Top predictive markers of pathological response in patients treated with taxane/anthracycline based therapies: A, gene expression data were mined to define genes with the top-ranked ROC characteristics in cases treated with TA and TFAC. The top 20 genes and their associated AUC values are shown for two cohorts. B, gene ontology analysis was performed on the top 150 predictive genes identified in each cohort. C, the indicated gene expression signatures were used to cluster cases based on signature expression value. The Kolmogorov–Smirnov (KS) statistic was used to determine the association of the signature value with clinical response. D, ROC curves of the indicated gene signatures that are enriched for cell-cycle/proliferation associated genes.

Distinct predictive genes emerge from different treatment cohorts

Interestingly, in the analysis of another cohort of patients treated with 5-fluorouracil, anthracycline, and cytoxan (FAC), a completely distinct cadre of genes with top AUC was observed (Fig. 2A). In this cohort, although the range of top AUC values was similar to those in TA and TFAC, there were surprisingly few genes in common. In addition, in the analysis of gene ontology, there was no enrichment for cell-cycle–associated processes as was observed in the TA and TFAC data (Fig. 2B). In part, this is because the majority of the high-ranked AUC genes are associated with residual disease, as opposed to pCR as is observed with cell-cycle genes in the TA and TFAC cohorts (Fig. 2C). These data were recapitulated at the single-gene level. For example, TTK was highly predictive in TA and TFAC but not FAC (Fig. 2D). In contrast, the top performing gene in the FAC dataset (IFI16) had little predictive value in TA/TFAC (Fig. 2E). These data suggest that it may be particularly challenging to define biomarkers of response that would be useful across multiple disease subtypes and treatment approaches.

Figure 2.

Distinct top predictive markers in cohorts treated with anthracycline/cytoxan: A, gene expression data were mined to define genes with the top-ranked ROC characteristics in cases treated with FAC. The top 20 genes and their associated AUC values are shown for the cohort. B, gene ontology analysis was performed on the top 150 predictive genes identified in the cohort. C, genes predictive in the FAC cohort were used to cluster cases based on signature expression values. The Kolmogorov–Smirnov (KS) statistic was used to determine the association of the signature value with clinical response. D, example of a gene that is predictive in TA/TFAC cohorts, but not FAC cohort. E, example of a gene that is predictive in FAC cohort, but not the TA/TFAC cohorts.

Identification of genes that are predictive of response to chemotherapy in both ER-positive and ER-negative breast cancer

Although both ER-positive and ER-negative breast cancers are treated with similar chemotherapy regimens in the neoadjuvant setting, there is clearly a distinction in the relative response between these two forms of disease (6, 20, 30). Therefore, we initially evaluated determinants that may be specific for ER-positive versus ER-negative breast cancer. To this end, we combined all treatment groups to build a large single dataset (Supplementary Table S1). In this cohort of 994 cases, cell-cycle–associated genes such as DEK, ANP32E, and MCM3 were particularly potent markers of response (Fig. 3A). Consistent with this analysis, gene ontology demonstrated the corresponding enrichment for terms associated with cell-cycle control (Fig. 3A). To interrogate genes selective to ER positive, ER negative, and either subtype of breast cancer, simple AUC ranking was applied to each subtype independently (Fig. 3B). Interestingly, in both cases cell-cycle–associated genes were still highly represented, although in ER-negative disease DNA replication and DNA repair terms showed the highest representation, whereas in ER-positive disease mitosis-related processes were highly overrepresented (Fig. 3B). Interestingly, when established cell-cycle signatures were evaluated in these cohorts based on ER status, they exhibited differential predictive power (Supplementary Fig. S3). For example, the GGI and OncotypeDx proliferation signatures performed best in ER-positive cases, whereas the RB signature performed best in ER-negative cases. The differential behaviors of specific genes were clearly apparent in the analysis of single genes (Fig. 3C and D). ILF2 is particularly relevant for ER-negative cancer, whereas ASPM is relevant largely for ER-positive breast cancer. To approach the question of overlap between these groups, we simply evaluated the intersection of the top performing 150 genes in each subtype. Surprisingly, only 19 genes were in common between ER-positive and ER-negative breast cancer, and of these genes, 18 were associated with pCR (Fig. 3E and F). As shown by GMMN, such genes were relatively effective predictors in both ER-positive and ER-negative breast cancer (Fig. 3G).
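The overlap analysis described above amounts to intersecting the top-ranked gene lists from each group. A minimal Python sketch follows, assuming each group's genes are already ordered from most to least predictive; the function name and input layout are hypothetical, and only the cutoff of 150 comes from the text.

```python
def shared_predictors(rankings, n=150):
    """Return genes that fall in the top n of every ranking, sorted by name.

    rankings: dict group name -> list of gene symbols, strongest first
    n:        number of top-ranked genes taken from each group
    """
    if not rankings:
        return []
    top_sets = [set(genes[:n]) for genes in rankings.values()]
    # Keep only genes that appear in every group's top-n set
    return sorted(set.intersection(*top_sets))
```

Because the function accepts any number of groups, the same call covers both the two-subtype comparison and a comparison across several treatment cohorts.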

Figure 3.

Definition of genes that harbor strong predictive value in discrete breast cancer subtypes: A, gene expression data were mined to define genes with the top-ranked ROC characteristics across all treatment groups. The top 20 genes and their associated AUC values are shown. Gene ontology analysis was performed on the top 150 predictive genes identified in the cohort. B, the integrated cohort was subdivided and top predictive genes unique to ER-positive and ER-negative breast cancer were determined by ROC analysis. Top 20 genes are shown for each subtype. Gene ontology analysis was performed on the top 150 predictive genes identified in each disease subtype. C, example of a gene that is predictive in ER-positive but not ER-negative breast cancer. D, example of a gene that is predictive in ER-negative but not ER-positive breast cancer. E, intersection of the predictive genes in ER-positive and ER-negative subtypes; genes with the top AUC values are shown. F, the genes predictive in both ER-positive and ER-negative disease were used to cluster cases based on signature expression value (all cases, ER positive, and ER negative). The Kolmogorov–Smirnov (KS) statistic was used to determine the association of the signature value with clinical response. G, example of a gene predictive in both ER-negative and ER-positive breast cancer.

Identification of genes with potent predictive power across treatment groups

To define the presence of predictive genes independent of treatment, a similar intersection analysis was performed between the TA, TFAC, and FAC datasets. Here, nine genes emerged that exhibited strong AUC values irrespective of treatment (Fig. 4A). These genes were evenly distributed for their association with pCR (DEK, DONSON, LBR, YEATS2) and residual disease (BCAM, METRN, FOXA1, SLC22A5, ANXA9; Fig. 4B). In tertile and ROC analysis using this nine-gene signature, these genes clearly had predictive value (Fig. 4B and C). The two genes with the strongest AUC associated with pCR (DEK) and residual disease (BCAM) were evaluated and shown to be differentially expressed in pCR versus residual disease cases, and associated with outcome irrespective of treatment or ER status (Fig. 4C–E and Supplementary Fig. S4).

Figure 4.

Definition of genes that harbor strong predictive value across treatment groups: A, intersection analysis of predictive genes across three treatment groups reveals only nine genes that are in common between the cohorts. Genes and their representative AUC values are shown for each cohort. B, the genes predictive in all treatment groups were used to cluster cases based on signature expression value. The Kolmogorov–Smirnov (KS) statistic was used to determine the association of the signature value with clinical response. The association of defined tertiles with pathological response is shown. C, the ROC predictive behavior is shown based on therapeutic intervention and ER status. D, the top performing gene associated with pCR, DEK, is shown as a single determinant of pathologic response. E, the top performing gene associated with residual disease, BCAM, is shown as a single determinant of pathologic response.

Top performing predictive markers retain predictive value in independent validation cohorts with distinct therapeutic regimens

The analysis performed supported the potential utilization of a small set of nine markers for the prediction of therapeutic response. To validate the performance, two additional cohorts were utilized, wherein patients were treated with AC and taxane or the microtubule poison ixabepilone or AC with or without taxane (summarized in Supplementary Table S2). In these cohorts, the nine genes identified were appropriately associated with pCR and residual disease as determined in the discovery cohorts (Fig. 5A). Importantly, the genes in these validation cohorts exhibited potent predictive value in tertile analysis and ROC analysis comparable with that observed in the discovery cohorts (Fig. 5B and C). Thus, the small panel of genes defined in this fashion could have utility in generally predicting therapeutic response.
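A tertile analysis of this kind can be sketched in Python as below. The text does not specify how tertile cut points were chosen, so splitting the score-ordered cases into three groups of near-equal size is an assumption, as are the function name and the 0/1 coding of response.

```python
def tertile_pcr_rates(scores, labels):
    """Split cases into low/middle/high tertiles of a score; return pCR rates.

    scores: signature or marker value per case
    labels: 1 for pCR, 0 for residual disease, in the same order as scores
    """
    n = len(scores)
    if n < 3:
        raise ValueError("need at least three cases to form tertiles")
    order = sorted(range(n), key=lambda i: scores[i])  # case indices, low to high
    cuts = [0, n // 3, (2 * n) // 3, n]                # assumption: equal-size groups
    rates = []
    for start, stop in zip(cuts, cuts[1:]):
        group = order[start:stop]
        rates.append(sum(labels[i] for i in group) / len(group))
    return rates  # [low tertile, middle tertile, high tertile]
```

For a marker associated with pCR the rates would be expected to rise from the low to the high tertile, and to fall for a marker associated with residual disease.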

Figure 5.

Validation of markers of chemotherapy response in independent cohorts. A, the nine genes predictive of response across treatments were used to cluster cases based on signature expression value in the two independent cohorts indicated. The Kolmogorov–Smirnov (KS) statistic was used to determine the association of the signature value with clinical response. B, the association of defined tertiles with pathologic response is shown. C, ROC curves of the association between gene expression and clinical outcomes are shown.

IHC validation of the robust predictive value of DEK and BCAM in an additional neoadjuvant cohort

Although molecular approaches are being progressively used in clinical care, IHC remains the mainstay of breast cancer biomarker analysis to guide treatment (e.g., ER staining). Therefore, we optimized IHC staining of the two top performing markers, DEK and BCAM. DEK stains nuclei and exhibited a range of staining from negative to high staining in 100% of cells (Fig. 6A). To quantify the staining, a modified histo-score was used in which the intensity (0–3) was multiplied by the percentage of positively staining cells, binned into quartile groups. This approach yielded a range of products from 0 to 12 that were used in tertiles and as a “semi-continuous” variable. BCAM exhibits membrane staining, and the percentage of cells exhibiting robust membrane staining was used both with tertile cut points and as a continuous variable. These marker levels were evaluated by ROC analysis (Supplementary Fig. S5). These data demonstrated potent predictive value of both markers. Using defined tertile cut points, there was a clear association of DEK and BCAM with pCR and residual disease, respectively (Fig. 6B). Importantly, in addition to pathologically defined pCR and residual disease, response to chemotherapy can be assessed quantitatively by the residual cancer burden (RCB). The RCB reflects the cellularity and tumor bed of the posttreatment resection specimen. As shown, high DEK levels were strongly associated with a low RCB, indicative of a favorable response to chemotherapy (Fig. 6C). In contrast, high BCAM was associated with a high RCB, indicative of a poor response to chemotherapy (Fig. 6C). To evaluate whether DEK and BCAM would add predictive value beyond standard pathologic features of response, univariate and multivariate statistical analyses were performed (Fig. 6D). The data indicate that both BCAM and DEK provided added statistical value beyond grade and nodal status. Together, these data indicate that DEK and BCAM are associated with the response to chemotherapy and could serve as important independent markers of response to neoadjuvant chemotherapy.
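The modified histo-score used for DEK can be written out as a short Python sketch. The text gives only the intensity range (0–3), the use of quartile groups, and the resulting 0 to 12 range; the exact quartile boundaries used here (0% scores 0, then 1–25%, 26–50%, 51–75%, and 76–100% score 1 to 4) are an assumption, as are the function names.

```python
import math


def quartile_group(percent_positive):
    """Map percentage of positive cells to a quartile group from 0 to 4."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if percent_positive == 0:
        return 0  # assumption: no positive cells gives group 0
    return min(4, math.ceil(percent_positive / 25))


def modified_h_score(intensity, percent_positive):
    """Staining intensity (0-3) times quartile group (0-4), giving 0 to 12."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity * quartile_group(percent_positive)
```

Because both factors are small integers, only a limited set of products is possible, which is consistent with the description of the score as a “semi-continuous” variable.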

Figure 6.

IHC analyses validate DEK and BCAM as predictive markers of response to neoadjuvant chemotherapy. A, the staining of DEK and BCAM was optimized in a clinical laboratory. Representative images of low and high staining of the markers are shown. B, association of high, medium, and low DEK and BCAM levels was evaluated as a function of pCR/residual disease. C, association of DEK and BCAM levels with the RCB. D, univariate and multivariate analysis of DEK and BCAM in the neoadjuvant cohort.

The improved treatment of cancer is tied to a more targeted approach to therapy. Typically, this is viewed in the context of drugs that have a specific molecular target; however, one of the most important areas in clinical care is to improve the delivery of chemotherapy. Across breast cancer subtypes, chemotherapy is routinely utilized to reduce the burden of disease in the neoadjuvant setting or prevent disease recurrence in the adjuvant setting (2, 12). Chemotherapy can provide long-term clinical benefit in breast cancer, as patients with tumors that experience a pCR to neoadjuvant chemotherapy have a particularly good long-term prognosis (7). This finding is highly relevant in the area of triple-negative breast cancer, wherein a pathologic complete response denotes the same prognosis as patients with ER-positive breast cancer (7). In contrast, tumors that progress while undergoing neoadjuvant chemotherapy have a poor prognosis, and are generally resistant to chemotherapy. For these reasons, it would be ideal to have a means to predict response to chemotherapy. This would allow chemotherapy to be targeted to patients whose tumors are most likely to benefit from such treatment, and consider surgery in combination with other modalities for patients that would be predicted to have an unfavorable response to treatment.

The primary objective of neoadjuvant chemotherapy is to improve surgical outcomes, and as with any systemic chemotherapy, it is used to reduce risk of distant recurrence. In principle, any tumor that is a candidate for adjuvant systemic chemotherapy could be treated with neoadjuvant chemotherapy. In breast cancer, the use of systemic adjuvant chemotherapy is largely evaluated on the basis of disease subtype. For example, triple-negative breast cancers will almost universally receive chemotherapy, whereas for ER-positive breast cancer, the molecular signatures (e.g., PAM50, Mammaprint, or OncotypeDx) are used to evaluate the benefit from adjuvant chemotherapy beyond endocrine therapy alone (27, 31–34). Here, we identified genes that had high-predictive value in ER-positive and ER-negative cohorts. Because of the frequent use of Herceptin in Her2-positive breast cancer, those cases were excluded from the analysis. In the consideration of ER-positive breast cancer, as may be expected, proliferation-associated genes were associated with therapeutic response because these genes differentiate luminal A and luminal B subtypes. In general, all proliferation signatures tested (i.e., GGI, Oncotype, CIN, RB signature) had similar activity in predicting response, although the RB signature had marginally better performance characteristics. Interestingly, such proliferation gene signatures were also effective within ER-negative breast cancer and individual proliferation genes harbored the top AUC. Unexpectedly, the individual genes associated with response were largely distinct from those in ER-positive disease. Genes involved in mitosis (e.g., Cyclin B2, MAD2L1, UBE2C) represented top ER-positive genes, whereas genes involved in DNA replication and DNA repair (MCM3, MSH2, FANCL) were most predictive in ER-negative breast cancer. 
Interspersed within the dominant cell-cycle/proliferation-associated gene programs were genes that have been implicated in the response to chemotherapy. For example, LDHB has been recently reported as a determinant of chemotherapy response and was identified herein (35). Despite the overall similarity of gene function, investigating the intersection between ER positive and ER negative revealed a small number of genes which maintained robust predictive power and suggested that a “general” set of markers could be identified that would be actionable within either the ER-positive or ER-negative subtypes. Interestingly, this cadre of genes was considerably overrepresented for association with pCR that was reflective of the fact that the vast majority of the top predictive genes in ER-negative breast cancer are associated with pCR not residual disease.

Most chemotherapy used in the neoadjuvant setting represents an anthracycline in combination with a taxane (TA), cytoxan (AC), or both (ACT; refs. 2, 12, 36). In the cohorts used herein, there was a significant difference in the top predictive markers between the TA and TFAC cohorts versus the FAC cohort. At present, it is impossible to determine whether this is a specific feature of the therapy utilized or represents some form of technical bias within the independent cohorts. However, the data from these analyses reinforce the concept that investigating multiple cohorts and treatments is important, and defining markers that are predictive under multiple independent contexts are presumptively critical for delineating utility under clinical conditions that may be “less than ideal.” For example, the interrogation of genes specific to the FAC cohort would yield many genes that have limited predictive power in other contexts (e.g., IFI16). In the analysis of the intersection between treatment groups, we defined only nine genes that were effective in all treatment settings. Importantly, we defined genes that were associated with cell cycle and pCR (e.g., DEK and DONSON) as well as genes associated with residual disease (e.g., BCAM and METRN). DEK is a nuclear oncogene that is implicated in DNA damage repair and apoptosis (37). LBR is the lamin B receptor, whereas YEATS2 and DONSON have largely unknown functions in mammalian cells. Of these genes, only a subset are cell cycle regulated (LBR, DONSON, and DEK); therefore, the inclusion of the other genes ostensibly provides complementary biologic information that would be expected to improve performance. Interestingly, none of these genes have been identified as being particularly relevant markers for the response to neoadjuvant chemotherapy. However, in our additional validation cohorts, these markers continued to be predictive of chemotherapy response. 
These findings contrasted with the analyses of Theraprint (Agendia Inc.), which has been presented as a means to judge potential response to chemotherapy. The Theraprint genes, and similar targeted panels, are based on individual studies of single genes and functional associations between proteins and drugs. For example, the levels of ribonucleotide reductase are taken to infer response to antimetabolites. We evaluated all genes within Theraprint individually and found that very few of these genes harbored significant predictive power (Supplementary Fig. S6). In this setting, ESR1 (estrogen receptor) was the most potent predictive marker. These findings could reflect the fact that neoadjuvant chemotherapy is not delivered as a single agent or, more likely, that owing to the complexity of cancer, simple gene inferences based on functional data do not uniformly hold true.
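
The single-gene evaluation and cross-cohort intersection discussed above can be illustrated with a minimal sketch. It assumes that each gene is scored by its ROC area (computed here through the equivalent Mann-Whitney pair count) and that a gene is retained only when it points in the same direction in every cohort. The 0.7 cutoff, the function names, the placeholder gene names (GENE_A, GENE_B, GENE_C), and all expression values are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC area for one gene; labels are 1 (pCR) or 0 (residual disease)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pCR and one residual-disease case")
    # Count pCR/residual pairs in which the pCR case is higher; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predictive_genes(expr: Dict[str, Sequence[float]],
                     labels: Sequence[int],
                     cutoff: float = 0.7) -> Dict[str, str]:
    """Map each gene passing the (assumed) cutoff to the outcome it tracks."""
    calls = {}
    for gene, values in expr.items():
        area = auc(values, labels)
        if area >= cutoff:
            calls[gene] = "pCR"
        elif area <= 1.0 - cutoff:
            calls[gene] = "residual disease"
    return calls


def shared_calls(per_cohort: List[Dict[str, str]]) -> Dict[str, str]:
    """Keep genes called in every cohort with the same direction."""
    shared = set(per_cohort[0].items())
    for calls in per_cohort[1:]:
        shared &= set(calls.items())
    return dict(shared)


# Invented toy data: three pCR cases followed by three residual-disease cases.
labels = [1, 1, 1, 0, 0, 0]
cohort_ta = {
    "GENE_A": [9, 8, 7, 3, 2, 1],
    "GENE_B": [1, 2, 3, 7, 8, 9],
    "GENE_C": [5, 1, 9, 4, 6, 2],
}
cohort_fac = {
    "GENE_A": [8, 9, 6, 2, 3, 1],
    "GENE_B": [2, 1, 3, 9, 7, 8],
    "GENE_C": [9, 8, 7, 1, 2, 3],  # predictive in this cohort only
}

shared = shared_calls([
    predictive_genes(cohort_ta, labels),
    predictive_genes(cohort_fac, labels),
])
print(sorted(shared.items()))
```

In this toy example GENE_C passes the cutoff in only one cohort and is therefore dropped by the intersection, mirroring the behavior described for cohort-specific genes.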

Over the last several years, new molecular tests have emerged to guide breast cancer treatment. Most notably, PAM50 has recently received approval, thereby joining OncotypeDx and MammaPrint in providing guidance to the treatment of ER-positive disease. Although there is growing acceptance for the use of RNA-based predictive tools, IHC remains the key tool in the context of evaluating the treatment of breast cancer. This is because the standard-of-care markers evaluated on the diagnostic core biopsy are assessed by IHC. In recognition of this issue, we interrogated the two strongest performing markers across treatment groups, DEK and BCAM, in pretreatment diagnostic biopsies. The antibodies were optimized on clinical stainers and used on a cohort of consecutive cases at UTSW. The data demonstrated that both markers harbor strong predictive value and that the simple combination of the two markers was particularly effective at predicting response to therapy. In addition to pCR/residual disease, tumors with low DEK and high BCAM had a particularly poor RCB score. RCB is a quantitative measure of residual disease and is associated with long-term prognosis. The finding that these markers were effective in terms of the fraction of residual disease is particularly important in defining patients for whom an alternative treatment is preferable. An RCB score of 3 indicates that the tumor was largely refractory to treatment, and perhaps alternative approaches to neoadjuvant chemotherapy should be considered. Despite the robust testing here in multiple independent retrospective cohorts with multiple independent approaches, it is important to interrogate predictive power prospectively. To this end, an observational study of biomarkers in the response to neoadjuvant chemotherapy is ongoing.
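
One simple way to combine two IHC markers of opposite direction is sketched below. It assumes each marker is dichotomized at a fixed cutoff and that cases are then grouped by the resulting pattern; the text does not specify how the combination was actually computed, so the cutoffs, the assumed 0 to 300 staining-score scale, the function names, and every case in the toy list are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical cutoffs on an assumed 0-300 staining-score scale (not from the study).
DEK_CUTOFF = 100.0
BCAM_CUTOFF = 100.0


def two_marker_group(dek: float, bcam: float) -> str:
    """Assign a case to a group from dichotomized DEK and BCAM scores."""
    dek_high = dek >= DEK_CUTOFF
    bcam_high = bcam >= BCAM_CUTOFF
    if dek_high and not bcam_high:
        return "DEK-high/BCAM-low"
    if bcam_high and not dek_high:
        return "DEK-low/BCAM-high"
    return "other"


def pcr_rate_by_group(cases: Iterable[Tuple[float, float, bool]]) -> Dict[str, float]:
    """Fraction of cases achieving pCR within each two-marker group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [pCR count, total cases]
    for dek, bcam, pcr in cases:
        group = two_marker_group(dek, bcam)
        counts[group][0] += int(pcr)
        counts[group][1] += 1
    return {group: n_pcr / total for group, (n_pcr, total) in counts.items()}


# Invented toy cases: (DEK score, BCAM score, achieved pCR).
toy_cases = [
    (220, 40, True),
    (180, 60, True),
    (150, 90, False),
    (30, 210, False),
    (60, 250, False),
    (80, 120, False),
    (200, 180, True),
    (40, 50, False),
]

for group, rate in sorted(pcr_rate_by_group(toy_cases).items()):
    print(f"{group}: pCR rate {rate:.2f}")
```

A grouping of this kind keeps the readout interpretable on a standard pathology report; a fitted model (for example, logistic regression on both scores) would be an alternative, at the cost of requiring a training cohort.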

No potential conflicts of interest were disclosed.

Conception and design: A.K. Witkiewicz, E.S. Knudsen

Development of methodology: A.K. Witkiewicz

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): A.K. Witkiewicz

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): A.K. Witkiewicz, U. Balaji

Writing, review, and/or revision of the manuscript: A.K. Witkiewicz, E.S. Knudsen

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): A.K. Witkiewicz, U. Balaji, E.S. Knudsen

Study supervision: A.K. Witkiewicz, E.S. Knudsen

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Caudle AS, Hunt KK. The neoadjuvant approach in breast cancer treatment: it is not just about chemotherapy anymore. Curr Opin Obstet Gynecol 2011;23:31–6. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21124221.
2. von Minckwitz G, Martin M. Neoadjuvant treatments for triple-negative breast cancer (TNBC). Ann Oncol 2012;23 Suppl 6:vi35–9.
3. Freedman RA, Winer EP. Adjuvant therapy for postmenopausal women with endocrine-sensitive breast cancer. Breast 2010;19:69–75.
4. Carlson RW, Brown E, Burstein HJ, Gradishar WJ, Hudis CA, Loprinzi C, et al. NCCN task force report: adjuvant therapy for breast cancer. J Natl Compr Canc Netw 2006;4 Suppl 1:S1–26.
5. Carey LA. Directed therapy of subtypes of triple-negative breast cancer. Oncologist 2011;16 Suppl 1:71–8.
6. Carey LA, Dees EC, Sawyer L, Gatti L, Moore DT, Collichio F, et al. The triple negative paradox: primary tumor chemosensitivity of breast cancer subtypes. Clin Cancer Res 2007;13:2329–34.
7. von Minckwitz G, Untch M, Blohmer JU, Costa SD, Eidtmann H, Fasching PA, et al. Definition and impact of pathologic complete response on prognosis after neoadjuvant chemotherapy in various intrinsic breast cancer subtypes. J Clin Oncol 2012;30:1796–804.
8. Cheang MC, Voduc KD, Tu D, Jiang S, Leung S, Chia SK, et al. Responsiveness of intrinsic subtypes to adjuvant anthracycline substitution in the NCIC.CTG MA.5 randomized trial. Clin Cancer Res 2012;18:2402–12.
9. Sotiriou C, Desmedt C. Gene expression profiling in breast cancer. Ann Oncol 2006;17 Suppl 10:x259–62.
10. Desmedt C, Sotiriou C. Proliferation: the most prominent predictor of clinical outcome in breast cancer. Cell Cycle 2006;5:2198–202.
11. Baselga J. Treatment of HER2-overexpressing breast cancer. Ann Oncol 2010;21 Suppl 7:vii36–40.
12. Esserman LJ, Berry DA, Cheang MC, Yau C, Perou CM, Carey L, et al. Chemotherapy response and recurrence-free survival in neoadjuvant breast cancer depends on biomarker profiles: results from the I-SPY 1 TRIAL (CALGB 150007/150012; ACRIN 6657). Breast Cancer Res Treat 2012;132:1049–62.
13. Liedtke C, Mazouni C, Hess KR, Andre F, Tordai A, Mejia JA, et al. Response to neoadjuvant therapy and long-term survival in patients with triple-negative breast cancer. J Clin Oncol 2008;26:1275–81.
14. Ignatiadis M, Singhal SK, Desmedt C, Haibe-Kains B, Criscitiello C, Andre F, et al. Gene modules and response to neoadjuvant chemotherapy in breast cancer subtypes: a pooled analysis. J Clin Oncol 2012;30:1996–2004.
15. Witkiewicz AK, Ertel A, McFalls J, Valsecchi ME, Schwartz G, Knudsen ES. RB-pathway disruption is associated with improved response to neoadjuvant chemotherapy in breast cancer. Clin Cancer Res 2012;18:5110–22. Available from: http://www.ncbi.nlm.nih.gov/pubmed/22811582.
16. Hatzis C, Pusztai L, Valero V, Booser DJ, Esserman L, Lluch A, et al. A genomic predictor of response and survival following taxane-anthracycline chemotherapy for invasive breast cancer. JAMA 2011;305:1873–81.
17. Tabchy A, Valero V, Vidaurre T, Lluch A, Gomez H, Martin M, et al. Evaluation of a 30-gene paclitaxel, fluorouracil, doxorubicin, and cyclophosphamide chemotherapy response predictor in a multicenter randomized trial in breast cancer. Clin Cancer Res 16:5351–61.
18. Balko JM, Giltnane J, Wang K, Schwarz LJ, Young CD, Cook RS, et al. Molecular profiling of the residual disease of triple-negative breast cancers after neoadjuvant chemotherapy identifies actionable therapeutic targets. Cancer Discov 2013;4:232–45.
19. Silver DP, Richardson AL, Eklund AC, Wang ZC, Szallasi Z, Li Q, et al. Efficacy of neoadjuvant Cisplatin in triple-negative breast cancer. J Clin Oncol 2010;28:1145–53.
20. Tordai A, Wang J, Andre F, Liedtke C, Yan K, Sotiriou C, et al. Evaluation of biological pathways involved in chemotherapy response in breast cancer. Breast Cancer Res 2008;10:R37.
21. Ayers M, Symmans WF, Stec J, Damokosh AI, Clark E, Hess K, et al. Gene expression profiles predict complete pathologic response to neoadjuvant paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide chemotherapy in breast cancer. J Clin Oncol 2004;22:2284–93.
22. Horak CE, Pusztai L, Xing G, Trifan OC, Saura C, Tseng LM, et al. Biomarker analysis of neoadjuvant doxorubicin/cyclophosphamide followed by ixabepilone or Paclitaxel in early-stage breast cancer. Clin Cancer Res 2013;19:1587–95.
23. Gianni L, Eiermann W, Semiglazov V, Manikhas A, Lluch A, Tjulandin S, et al. Neoadjuvant chemotherapy with trastuzumab followed by adjuvant trastuzumab versus neoadjuvant chemotherapy alone, in patients with HER2-positive locally advanced breast cancer (the NOAH trial): a randomised controlled superiority trial with a parallel HER2-negative cohort. Lancet 2010;375:377–84.
24. Herschkowitz JI, He X, Fan C, Perou CM. The functional loss of the retinoblastoma tumour suppressor is a common event in basal-like and luminal B breast carcinomas. Breast Cancer Res 2008;10:R75.
25. Loi S, Haibe-Kains B, Desmedt C, Lallemand F, Tutt AM, Gillet C, et al. Definition of clinically distinct molecular subtypes in estrogen receptor-positive breast carcinomas through genomic grade. J Clin Oncol 2007;25:1239–46.
26. Carter SL, Eklund AC, Kohane IS, Harris LN, Szallasi Z. A signature of chromosomal instability inferred from gene expression profiles predicts clinical outcome in multiple human cancers. Nat Genet 2006;38:1043–8.
27. Paik S. Development and clinical utility of a 21-gene recurrence score prognostic assay in patients with early breast cancer treated with tamoxifen. Oncologist 2007;12:631–5.
28. van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 2002;347:1999–2009.
29. Ertel A, Dean JL, Rui H, Liu C, Witkiewicz AK, Knudsen KE, et al. RB-pathway disruption in breast cancer: differential association with disease subtypes, disease-specific prognosis and therapeutic response. Cell Cycle 2010;9:4153–63.
30. Dunbier AK, Anderson H, Ghazoui Z, Salter J, Parker JS, Perou CM, et al. Association between breast cancer subtypes and response to neoadjuvant anastrozole. Steroids 2011;76:736–40.
31. Paik S, Tang G, Shak S, Kim C, Baker J, Kim W, et al. Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer. J Clin Oncol 2006;24:3726–34.
32. Nielsen TO, Parker JS, Leung S, Voduc D, Ebbert M, Vickery T, et al. A comparison of PAM50 intrinsic subtyping with immunohistochemistry and clinical prognostic factors in tamoxifen-treated estrogen receptor-positive breast cancer. Clin Cancer Res 2010;16:5222–32.
33. Dowsett M, Sestak I, Lopez-Knowles E, Sidhu K, Dunbier AK, Cowens JW, et al. Comparison of PAM50 risk of recurrence score with oncotype DX and IHC4 for predicting risk of distant recurrence after endocrine therapy. J Clin Oncol 2013;31:2783–90.
34. Sestak I, Dowsett M, Zabaglo L, Lopez-Knowles E, Ferree S, Cowens JW, et al. Factors predicting late recurrence for estrogen receptor-positive breast cancer. J Natl Cancer Inst 2013;105:1504–11.
35. Dennison JB, Molina JR, Mitra S, Gonzalez-Angulo AM, Balko JM, Kuba MG, et al. Lactate dehydrogenase B: a metabolic marker of response to neoadjuvant chemotherapy in breast cancer. Clin Cancer Res 2013;19:3703–13.
36. Gampenrieder SP, Rinnerthaler G, Greil R. Neoadjuvant chemotherapy and targeted therapy in breast cancer: past, present, and future. J Oncol 2013;2013:732047.
37. Privette Vinnedge LM, Kappes F, Nassar N, Wells SI. Stacking the DEK: from chromatin topology to cancer stem cells. Cell Cycle 2013;12:51–66.

Supplementary data