Abstract
In the August 15, 2004, issue of Clinical Cancer Research, Nielsen and colleagues demonstrated how a cancer subtype identified by gene expression profiling could be validated using a widely accessible technology (immunohistochemistry). This opened the door to large-scale studies of archival cohorts and clinical trials, which allowed detailed clinical understanding of a new genomic discovery. Clin Cancer Res; 21(8); 1779–81. ©2015 AACR.
See related article by Nielsen et al., Clin Cancer Res 2004;10(16) Aug 15, 2004;5367–74
The first genome-wide method applied to human tumors was gene expression profiling using DNA microarrays, a novel methodologic approach that created a great deal of excitement in the research community at the beginning of this millennium (1). Replication of findings, especially using independent specimens and different techniques, was critical, but few studies attempted this on any large scale. One of the most important early findings from the DNA microarray–based study of human tumors was the identification of the “basal-like” subtype of breast cancer, which, by gene expression analysis, looked completely different from estrogen receptor (ER)–positive or HER2-positive breast cancers (2). To validate this discovery, we took two approaches: replication using additional cohorts of patients assayed on other DNA microarray platforms (3) and replication at the level of protein expression. Replication via protein expression was included within the initial article, in which we showed that the gene expression–defined basal-like breast tumors also typically stained positive by immunohistochemistry (IHC) for Keratins 5, 6, and/or 17; as noted in that initial publication, other investigators many years before had demonstrated the presence of rare breast tumors that stained positive for these Keratin markers (4, 5), which typically identify the basal layer of stratified epithelia. The first independent cohort protein validation of this CK5/6-positive breast tumor IHC result was facilitated by the technical development of tissue microarrays (TMA; ref. 6). Using this new, data-rich resource, we confirmed two results, namely that CK5/6-positive tumors were present at an approximately 10% frequency, and that they tended to have a worse outcome than CK5/6-negative tumors (7).
Building upon these TMA results, in the Nielsen and colleagues (8) article that is the focus of this commentary, we strove to (i) improve the basal-like tumor IHC-based definition; (ii) develop a more complete IHC subtyping algorithm for the three major gene expression subtypes (i.e., luminal, HER2-enriched, and basal-like); and (iii) relate these subtype definitions to breast cancer clinical outcomes using a large, well-annotated cohort of patient specimens derived from clinical trials. One of the key resources for Nielsen and colleagues (8) was the availability of a set of 115 breast tumors that were represented by both frozen and formalin-fixed, paraffin-embedded (FFPE) materials. The frozen tumors were used for microarray analysis (i.e., gene expression), and the FFPE sections were used to screen a number of potential basal-like tumor markers by IHC, using antibodies that targeted the products of genes that were identified in the microarray studies (i.e., CK5/6, CK17, c-KIT, and EGFR). The key experiment was to compare the microarray and IHC data on the same tumors and to objectively identify protein markers that were able to capture the gene expression–defined basal-like subtype.
Another key point that we reported in the article discussed herein (8) was that no single protein marker was sensitive enough to detect most basal-like tumors, and, thus, two markers (EGFR and CK5/6) were used together to capture the basal-like subtype from among patients with triple-negative breast cancer (TNBC). Having trained an IHC classifier against the emergent gold-standard expression microarray definition of basal-like breast cancer, our second key resource came into play: a set of 930 clinically annotated FFPE specimens that allowed an independent validation of the new IHC panel in relation to clinical outcome, in a series with considerably more power and longer follow-up than in any other study at that time. Results confirmed that patients with basal-like tumors had a worse outcome than patients with non–basal-like tumors, particularly over the first 5 years after diagnosis. Thus was born the first-generation IHC panel for intrinsic subtyping. This translation of a genomic signature into an IHC assay was a key advance for the field of genomics in general: an important genomic finding was translated to another technology platform (IHC) and to another discipline (pathology), and showed the same clinical outcomes.
We continued to improve upon the IHC-based assay we reported in 2004 (8), and in 2008, we showed this new definition to be superior to simple triple-negative (i.e., ER−, PR−, and HER2−) status at predicting outcomes (9). This IHC panel was then further expanded to include an IHC-definition of luminal A versus luminal B, resulting in a six-biomarker assay (ER, PR, HER2, Ki-67, CK5/6, and EGFR; ref. 10). The beauty of these IHC-subtyping panels was that they provided a powerful, practical, and widely available tool (i.e., subtyping by IHC) to the worldwide research community that led directly to a number of key findings, prominently including the observation that the basal-like subtype differs in frequency by race and age, with basal-like tumors being more frequent in (i) African Americans (11, 12), (ii) Africans (Nigerians; ref. 13), and (iii) young women with breast cancer irrespective of race (11). Ultimately, this IHC-subtyping panel was adopted by the St. Gallen Consensus Committee as a means for therapeutic and prognostic stratification (14). With a further refinement to include a progesterone receptor cutoff point to optimize the identification of luminal B tumors (15), this IHC definition is still in the St. Gallen guidelines of 2013 (16).
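The way such a six-marker panel maps stains to subtype labels can be illustrated with a minimal sketch. The rule below is an assumption made for illustration, not the published algorithm: it takes each stain as already dichotomized (no scoring cutoffs are implied), calls any HER2-positive tumor HER2-enriched, splits hormone receptor–positive tumors by Ki-67, and calls a triple-negative tumor basal-like when CK5/6 and/or EGFR is positive. The names `IHCPanel` and `ihc_subtype` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IHCPanel:
    """Dichotomized stains for one tumor (True = scored positive/high).

    How each stain is scored and thresholded is deliberately left out;
    those cutoffs are not specified here.
    """
    er: bool
    pr: bool
    her2: bool
    ki67_high: bool
    ck56: bool
    egfr: bool


def ihc_subtype(p: IHCPanel) -> str:
    """Assign an IHC surrogate subtype.

    The order of the checks is an illustrative assumption, not the
    published decision rule.
    """
    if p.her2:
        # Assumption: HER2 positivity takes precedence over hormone receptors.
        return "HER2-enriched"
    if p.er or p.pr:
        # Assumption: Ki-67 alone separates luminal B from luminal A.
        return "luminal B" if p.ki67_high else "luminal A"
    if p.ck56 or p.egfr:
        # Triple-negative and positive for CK5/6 and/or EGFR.
        return "basal-like"
    return "triple-negative, non-basal"


tumor = IHCPanel(er=False, pr=False, her2=False,
                 ki67_high=True, ck56=True, egfr=False)
print(ihc_subtype(tumor))  # basal-like
```

Requiring only one of the two basal markers reflects the point made above that no single marker was sensitive enough on its own.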
An especially important consequence of the development of these IHC panels for breast cancer intrinsic subtyping is that it allowed subtyping to be performed on studies that only existed as FFPE archives in an era when these materials were not amenable to nucleic acid–based assays, and it brought a relatively simple means of assessing clinically relevant subtypes to the world community, as many countries cannot afford the more expensive multigene tests. Thus, a genomic finding of potential clinical relevance was translated from one technology to another, further validated, and adopted by the community as a common language for addressing breast cancer heterogeneity. Work in other cancer types has also adopted this strategy to create IHC panels to identify genomically defined subtypes in a widely accessible fashion.
In subsequent years, many other laboratories developed their own IHC surrogates for basal-like tumors, but very few actually compared their IHC results with the gene expression–defined subtypes; this point should be kept in mind even today as results are interpreted when comparing different IHC-based definitions of breast tumor subtypes. Drawing on this literature and on gene expression data again, we more recently completed a comprehensive survey of 46 proposed IHC biomarkers of basal-like breast cancers and found that few could actually outperform the ER- and HER2-negative, CK5/6, and/or EGFR-positive definition, although expression of Nestin and loss of INPP4B showed good sensitivity and excellent specificity (17).
We must also point out, however, that at the same time these IHC panels were being developed, so was a clinically applicable gene expression assay for breast cancer intrinsic subtyping. The qRT-PCR–based PAM50 50-gene assay, first described by Parker and colleagues (18), was designed to be compatible with RNA coming from FFPE materials, and to incorporate RNA measurements, specifically including those genes we validated in our 2004 article (8). From the PAM50 assay also arose the “risk of recurrence” (ROR) score for predicting prognosis of patients with breast cancer. In a later article, we further improved the ROR score to include a term for proliferation; we note this here because in that article we thoroughly tested the gene expression–based classification against the IHC-based classification (19). The result was that the gene expression–based intrinsic subtyping method proved superior in all ways, including better survival predictions and reproducibility; this finding of the superiority of gene expression assays versus IHC panels, both designed for the same purpose, has also held up on many other datasets (20).
This finding makes sense from a simple mathematical perspective because multiple redundant markers will generally outperform a much more limited marker subset (50 vs. 6), especially where the measurement method for the 50 has a larger dynamic range of expression and a higher level of quantitative precision compared with the 6. Furthermore, IHC techniques incorporate several methodologic steps that are difficult to standardize, followed by a subjective visual interpretation. Not surprisingly, this leads to poor analytic reproducibility for single markers (20), let alone for combined sets of IHC markers. In contrast, the analytic validity of carefully designed gene expression–based tests applied to FFPE material has been proven for breast cancer prognostic assays using qRT-PCR (recurrence score, ref. 21; EndoPredict, ref. 22) and the NanoString-based Prosigna assay (23), the latter of which has received FDA approval as a multianalyte in vitro diagnostic test that is ultimately grounded in the research described in our 2004 article (8). Thus, when possible, we always advocate the use of the multigene assays, given their greater accuracy and reproducibility compared with IHC; however, when gene expression assays are not available, the current IHC panel can be used for research and/or clinical purposes when the proper proficiency testing is also utilized.
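The mathematical intuition behind the 50-versus-6 comparison can be sketched under a strong simplifying assumption: every marker reports the same underlying signal plus independent noise of equal size. Averaging n such markers then shrinks the noise standard deviation by a factor of the square root of n. The functions `noise_sd_of_mean` and `simulate_sd` are hypothetical names for this illustration only and do not model either assay.

```python
import math
import random
import statistics


def noise_sd_of_mean(sigma: float, n_markers: int) -> float:
    """Noise SD of the average of n markers, each with independent noise SD sigma."""
    return sigma / math.sqrt(n_markers)


def simulate_sd(n_markers: int, sigma: float = 1.0,
                trials: int = 2000, seed: int = 0) -> float:
    """Empirical SD of the marker average over repeated simulated tumors."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = [
        statistics.fmean([rng.gauss(0.0, sigma) for _ in range(n_markers)])
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


# Closed-form comparison of a 6-marker and a 50-marker average.
ratio = noise_sd_of_mean(1.0, 6) / noise_sd_of_mean(1.0, 50)
print(f"noise SD ratio (6 vs. 50 markers): {ratio:.2f}")
print(f"simulated SD, 6 markers:  {simulate_sd(6):.3f}")
print(f"simulated SD, 50 markers: {simulate_sd(50):.3f}")
```

Under these assumptions the 50-marker average carries roughly 2.9-fold less noise than the 6-marker average (the square root of 50/6). Real markers are correlated and differ in quality, so the practical gain is smaller, and this sketch ignores the differences in dynamic range and subjective scoring that the text also invokes.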
In closing, the Nielsen and colleagues (8) article has made a significant impact upon the fields of genomics, breast cancer, and even patient care. We did not anticipate that this article would receive over 1,900 citations and, ultimately, lay the groundwork for an adopted standard for assessing breast cancer heterogeneity and a regulatory-approved clinical test. Personally, this also represented prominent authorship in a widely read journal at a stage when our independent careers were just beginning. We wish to thank the AACR and Clinical Cancer Research for the publication of this article and many others since, and hope that this story serves to illustrate one roadmap for translation of a genomic finding into an assay that can affect patient care.
Disclosure of Potential Conflicts of Interest
T.O. Nielsen reports receiving a commercial research grant from NanoString Technologies, has ownership interest (including patents) in Bioclassifier LLC, and is a consultant/advisory board member for NanoString Technologies. C.M. Perou has ownership interest (including patents) in and is a consultant/advisory board member for Bioclassifier LLC. No other potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: T.O. Nielsen, C.M. Perou
Writing, review, and/or revision of the manuscript: T.O. Nielsen, C.M. Perou