In the August 15, 2004, issue of Clinical Cancer Research, Nielsen and colleagues demonstrated how a cancer subtype identified by gene expression profiling could be validated using a widely accessible technology (immunohistochemistry). This opened the door to large-scale studies of archival cohorts and clinical trials, which allowed detailed clinical understanding of a new genomic discovery. Clin Cancer Res; 21(8); 1779–81. ©2015 AACR.

See related article by Nielsen et al., Clin Cancer Res 2004;10(16) Aug 15, 2004;5367–74

The first genome-wide method applied to human tumors was gene expression profiling using DNA microarrays, a novel methodologic approach that created a great deal of excitement in the research community at the beginning of this millennium (1). Replication of findings, especially using independent specimens and different techniques, was critical, but few studies attempted this on any large scale. One of the most important early findings from the DNA microarray–based study of human tumors was the identification of the “basal-like” subtype of breast cancer, which, by gene expression analysis, looked completely different from ER-positive or HER2-positive breast cancers (2). To validate this discovery, we took two approaches: replication using additional cohorts of patients assayed on other DNA microarray platforms (3) and replication at the level of protein expression. Replication via protein expression was included within the initial article, in which we showed that the gene expression–defined basal-like breast tumors also typically stained positive by immunohistochemistry (IHC) for Keratins 5, 6, and/or 17; as noted in that initial publication, other investigators many years before had demonstrated the presence of rare breast tumors that stained positive for these Keratin markers (4, 5), which typically identify the basal layer of stratified epithelia. The first independent cohort protein validation of this CK5/6-positive breast tumor IHC result was facilitated by the technical development of tissue microarrays (TMA; ref. 6). Using this new, data-rich resource, we confirmed two results, namely that CK5/6-positive tumors were present at an approximately 10% frequency, and that they tended to have a worse outcome than CK5/6-negative tumors (7).

Building upon these TMA results, in the Nielsen and colleagues (8) article that is the focus of this commentary, we strove to (i) improve the basal-like tumor IHC-based definition; (ii) develop a more complete IHC subtyping algorithm for the three major gene expression subtypes (i.e., luminal, HER2-enriched, and basal-like); and (iii) relate these subtype definitions to breast cancer clinical outcomes using a large, well-annotated cohort of patient specimens derived from clinical trials. One of the key resources for Nielsen and colleagues (8) was the availability of a set of 115 breast tumors that were represented by both frozen and formalin-fixed and paraffin embedded (FFPE) materials. The frozen tumors were used for microarray analysis (i.e., gene expression), and the FFPE sections were used to screen a number of potential basal-like tumor markers by IHC, using antibodies that targeted the products of genes that were identified in the microarray studies (i.e., CK5/6, CK17, c-KIT, and EGFR). The key experiment was to compare the microarray and IHC data simultaneously and to objectively identify protein markers that were able to capture the gene expression–defined basal-like subtype.
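
As a minimal sketch of the comparison described above, a candidate IHC marker can be scored against the gene expression–defined subtype by its sensitivity and specificity. The function name and the toy data below are hypothetical illustrations, not values or code from the 2004 study, and the sketch assumes each tumor has already been scored as marker-positive or marker-negative.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    ihc_positive: Sequence[bool],
    is_basal_like: Sequence[bool],
) -> Tuple[float, float]:
    """Score one IHC marker against the gene expression-defined subtype.

    ihc_positive[i]  : marker stained positive in tumor i (assumed pre-scored)
    is_basal_like[i] : tumor i is basal-like by microarray (reference standard)
    """
    if len(ihc_positive) != len(is_basal_like):
        raise ValueError("both sequences must describe the same tumors")

    pairs = list(zip(ihc_positive, is_basal_like))
    tp = sum(1 for stained, basal in pairs if stained and basal)
    fn = sum(1 for stained, basal in pairs if not stained and basal)
    tn = sum(1 for stained, basal in pairs if not stained and not basal)
    fp = sum(1 for stained, basal in pairs if stained and not basal)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one basal-like and one non-basal-like tumor")

    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 4 basal-like and 6 non-basal-like tumors.
basal = [True, True, True, True, False, False, False, False, False, False]
marker = [True, True, True, False, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(marker, basal)
```

In this toy example the marker detects three of four basal-like tumors and is negative in five of six non–basal-like tumors.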

Another key point that we reported in the article discussed herein (8) was that no single protein marker was sensitive enough to detect most basal-like tumors, and, thus, two markers (EGFR and CK5/6) were used together to capture the basal-like subtype from among the other patients with triple-negative breast cancer (TNBC). Having trained an IHC classifier against the emergent gold-standard expression microarray definition of basal-like breast cancer, our second key resource came into play: a set of 930 clinically annotated FFPE specimens that allowed an independent validation of the new IHC panel in relation to clinical outcome, in a series with considerably more power and longer follow-up than in any other study at that time. Results confirmed that basal-like patients had a worse outcome than non–basal-like patients, particularly over the first 5 years after diagnosis. Thus was born the first-generation IHC subtyping panel for intrinsic subtyping. This translation of a genomic signature into an IHC assay was a key advance for the field of genomics in general, as an important genomic finding was translated to another technology platform (IHC) and to another discipline (pathology), and showed the same clinical outcomes.

We continued to improve upon the IHC-based assay we reported in 2004 (8), and in 2008, we showed this new definition to be superior to simple triple-negative (i.e., ER, PR, and HER2) status at predicting outcomes (9). This IHC panel was then further expanded to include an IHC-definition of luminal A versus luminal B, resulting in a six-biomarker assay (ER, PR, HER2, Ki-67, CK5/6, and EGFR; ref. 10). The beauty of these IHC-subtyping panels was that they provided a powerful, practical, and widely available tool (i.e., subtyping by IHC) to the worldwide research community that led directly to a number of key findings, prominently including the observation that the basal-like subtype differs in frequency by race and age, with basal-like tumors being more frequent in (i) African Americans (11, 12), (ii) Africans (Nigerians; ref. 13), and (iii) young women with breast cancer irrespective of race (11). Ultimately, this IHC-subtyping panel was adopted by the St. Gallen Consensus Committee as a means for therapeutic and prognostic stratification (14). With a further refinement to include a progesterone receptor cutoff point to optimize the identification of luminal B tumors (15), this IHC definition is still in the St. Gallen guidelines of 2013 (16).

An especially important consequence of the development of these IHC panels for breast cancer intrinsic subtyping is that it allowed subtyping to be performed on studies that only existed as FFPE archives in an era when these materials were not amenable to nucleic acid–based assays, and it brought a relatively simple means of assessing clinically relevant subtypes to the world community, as many countries cannot afford the more expensive multigene tests. Thus, a genomic finding of potential clinical relevance was translated from one technology to another, further validated, and adopted by the community as a common language for addressing breast cancer heterogeneity. Work in other cancer types has also adopted this strategy to create IHC panels to identify genomically defined subtypes in a widely accessible fashion.

In subsequent years, many other laboratories developed their own IHC surrogates for basal-like tumors, but very few actually compared their IHC results with the gene expression–defined subtypes; this point should be kept in mind even today as results are interpreted when comparing different IHC-based definitions of breast tumor subtypes. Drawing on this literature and on gene expression data again, we more recently completed a comprehensive survey of 46 proposed IHC biomarkers of basal-like breast cancers and found that few could actually outperform the ER- and HER2-negative, CK5/6, and/or EGFR-positive definition, although expression of Nestin and loss of INPP4B showed good sensitivity and excellent specificity (17).
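
The IHC definition referred to above (ER-negative and HER2-negative, with CK5/6 and/or EGFR positive) can be written as a simple decision rule. The following is a minimal sketch only: the class and function names are hypothetical, and it assumes each stain has already been called positive or negative by a pathologist, because the scoring cutoffs for individual stains are not given in this commentary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IHCResult:
    """Per-tumor IHC calls (assumed already dichotomized by a pathologist)."""
    er_positive: bool
    her2_positive: bool
    ck56_positive: bool
    egfr_positive: bool


def is_basal_like_by_ihc(result: IHCResult) -> bool:
    """ER-negative and HER2-negative, plus CK5/6 and/or EGFR positive."""
    return (
        not result.er_positive
        and not result.her2_positive
        and (result.ck56_positive or result.egfr_positive)
    )


# Hypothetical examples
print(is_basal_like_by_ihc(IHCResult(False, False, True, False)))   # CK5/6 only
print(is_basal_like_by_ihc(IHCResult(False, False, False, False)))  # no basal marker
print(is_basal_like_by_ihc(IHCResult(True, False, True, True)))     # ER-positive
```

Under this rule, a tumor that is negative for ER and HER2 but stains for neither CK5/6 nor EGFR is not called basal-like, which is why the panel separates basal-like tumors from the remaining triple-negative cases.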

We must also point out, however, that at the same time these IHC panels were being developed, so was a clinically applicable gene expression assay for breast cancer intrinsic subtyping. The qRT-PCR–based PAM50 50-gene assay, first described by Parker and colleagues (18), was designed to be compatible with RNA coming from FFPE materials, and to incorporate RNA measurements, specifically including those genes we validated in our 2004 article (8). From the PAM50 assay also arose the “risk of recurrence” (ROR) score for predicting prognosis of patients with breast cancer. In a later article we further improved the ROR score to include a term for proliferation, and note this here because we thoroughly tested this gene expression–based classification versus the IHC-based classification (19). The result was that the gene expression–based intrinsic subtyping method proved superior in all ways, including better survival predictions and reproducibility; this finding of the superiority of gene expression assays versus IHC panels, both designed for the same purpose, has also held up on many other datasets (20).

This finding makes sense from a simple mathematical perspective because multiple redundant markers will generally outperform a much more limited marker subset (50 vs. 6), especially where the measurement method for the 50 has a larger dynamic range of expression and a higher level of quantitative precision compared with the 6. Furthermore, IHC techniques incorporate several methodologic steps that are difficult to standardize, followed by a subjective visual interpretation. Not surprisingly, this leads to poor analytic reproducibility for single markers (20), let alone for combined sets of IHC markers. In contrast, the analytic validity of carefully designed gene expression–based tests applied to FFPE material has been proven for breast cancer prognostic assays using qRT-PCR (recurrence score, ref. 21; EndoPredict, ref. 22) and the NanoString-based Prosigna assay (23), the latter of which has received FDA approval as a multianalyte in vitro diagnostic test that is ultimately grounded in the research described in our 2004 article (8). Thus, when possible, we always advocate the use of the multigene assays, given their greater accuracy and reproducibility compared with IHC; however, when gene expression assays are not available, the current IHC panel can be used for research and/or clinical purposes when the proper proficiency testing is also utilized.
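
One way to make the marker-count argument concrete is a deliberately idealized calculation. The sketch below assumes that every marker measures the same underlying signal with independent noise of equal standard deviation, so that averaging n markers reduces the noise by a factor of the square root of n. These are simplifying assumptions made here for illustration, not properties demonstrated for PAM50 or for the IHC panel; real markers are correlated and unequally informative, so the actual gain would be smaller, and the function name is hypothetical.

```python
import math


def noise_of_mean(per_marker_sd: float, n_markers: int) -> float:
    """Standard deviation of the mean of n independent, equally noisy markers.

    Assumption (illustrative only): noise is independent across markers and
    has the same standard deviation for each marker.
    """
    if n_markers < 1:
        raise ValueError("n_markers must be at least 1")
    return per_marker_sd / math.sqrt(n_markers)


# Compare a 6-marker panel with a 50-marker panel at equal per-marker noise.
ratio = noise_of_mean(1.0, 6) / noise_of_mean(1.0, 50)
print(f"6-marker noise / 50-marker noise = {ratio:.2f}")
```

Under these assumptions the 6-marker average is roughly 2.9 times noisier than the 50-marker average, before accounting for the differences in dynamic range and quantitative precision.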

In closing, the Nielsen and colleagues (8) article has made a significant impact upon the fields of genomics, breast cancer, and even patient care. We did not anticipate that this article would receive over 1,900 citations and, ultimately, lay the groundwork for an adopted standard for assessing breast cancer heterogeneity and a regulatory-approved clinical test. Personally, this also represented prominent authorship in a widely read journal at a stage when our independent careers were just beginning. We wish to thank the AACR and Clinical Cancer Research for the publication of this article and many others since, and hope that this story serves to illustrate one roadmap for translation of a genomic finding into an assay that can affect patient care.

T.O. Nielsen reports receiving a commercial research grant from NanoString Technologies, has ownership interest (including patents) in Bioclassifier LLC, and is a consultant/advisory board member for NanoString Technologies. C.M. Perou has ownership interest (including patents) in and is a consultant/advisory board member for Bioclassifier LLC. No other potential conflicts of interest were disclosed.

Conception and design: T.O. Nielsen, C.M. Perou

Writing, review, and/or revision of the manuscript: T.O. Nielsen, C.M. Perou

1. Lander ES. Array of hope. Nat Genet 1999;21:3–4.
2. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406:747–52.
3. Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 2003;100:8418–23.
4. Bosch FX, Leube RE, Achtstatter T, Moll R, Franke WW. Expression of simple epithelial type cytokeratins in stratified epithelia as detected by immunolocalization and hybridization in situ. J Cell Biol 1988;106:1635–48.
5. Moll R, Franke WW, Schiller DL, Geiger B, Krepler R. The catalog of human cytokeratins: patterns of expression in normal epithelia, tumors and cultured cells. Cell 1982;31:11–24.
6. Kallioniemi OP, Wagner U, Kononen J, Sauter G. Tissue microarray technology for high-throughput molecular profiling of cancer. Hum Mol Genet 2001;10:657–62.
7. van de Rijn M, Perou CM, Tibshirani R, Haas P, Kallioniemi O, Kononen J, et al. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am J Pathol 2002;161:1991–6.
8. Nielsen TO, Hsu FD, Jensen K, Cheang M, Karaca G, Hu Z, et al. Immunohistochemical and clinical characterization of the basal-like subtype of invasive breast carcinoma. Clin Cancer Res 2004;10:5367–74.
9. Cheang MC, Voduc D, Bajdik C, Leung S, McKinney S, Chia SK, et al. Basal-like breast cancer defined by five biomarkers has superior prognostic value than triple-negative phenotype. Clin Cancer Res 2008;14:1368–76.
10. Cheang MC, Chia SK, Voduc D, Gao D, Leung S, Snider J, et al. Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer. J Natl Cancer Inst 2009;101:736–50.
11. Millikan RC, Newman B, Tse CK, Moorman PG, Conway K, Smith LV, et al. Epidemiology of basal-like breast cancer. Breast Cancer Res Treat 2008;109:123–39.
12. Carey LA, Perou CM, Livasy CA, Dressler LG, Cowan D, Conway K, et al. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. JAMA 2006;295:2492–502.
13. Huo D, Ikpatt F, Khramtsov A, Dangou JM, Nanda R, Dignam J, et al. Population differences in breast cancer: survey in indigenous African women reveals over-representation of triple-negative breast cancer. J Clin Oncol 2009;27:4515–21.
14. Goldhirsch A, Wood WC, Coates AS, Gelber RD, Thurlimann B, Senn HJ, et al. Strategies for subtypes—dealing with the diversity of breast cancer: highlights of the St. Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2011. Ann Oncol 2011;22:1736–47.
15. Prat A, Cheang MC, Martin M, Parker JS, Carrasco E, Caballero R, et al. Prognostic significance of progesterone receptor-positive tumor cells within immunohistochemically defined luminal A breast cancer. J Clin Oncol 2013;31:203–9.
16. Goldhirsch A, Winer EP, Coates AS, Gelber RD, Piccart-Gebhart M, Thurlimann B, et al. Personalizing the treatment of women with early breast cancer: highlights of the St Gallen international expert consensus on the primary therapy of early breast cancer 2013. Ann Oncol 2013;24:2206–23.
17. Won JR, Gao D, Chow C, Cheng J, Lau SY, Ellis MJ, et al. A survey of immunohistochemical biomarkers for basal-like breast cancer against a gene expression profile gold standard. Mod Pathol 2013;26:1438–50.
18. Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol 2009;27:1160–7.
19. Nielsen TO, Parker JS, Leung S, Voduc D, Ebbert M, Vickery T, et al. A comparison of PAM50 intrinsic subtyping with immunohistochemistry and clinical prognostic factors in tamoxifen-treated estrogen receptor-positive breast cancer. Clin Cancer Res 2010;16:5222–32.
20. Prat A, Ellis MJ, Perou CM. Practical implications of gene-expression-based assays for breast oncologists. Nat Rev Clin Oncol 2012;9:48–57.
21. Cronin M, Sangli C, Liu ML, Pho M, Dutta D, Nguyen A, et al. Analytical validation of the Oncotype DX genomic diagnostic test for recurrence prognosis and therapeutic response prediction in node-negative, estrogen receptor-positive breast cancer. Clin Chem 2007;53:1084–91.
22. Kronenwett R, Bohmann K, Prinzler J, Sinn BV, Haufe F, Roth C, et al. Decentral gene expression analysis: analytical validation of the Endopredict genomic multianalyte breast cancer prognosis test. BMC Cancer 2012;12:456.
23. Nielsen T, Wallden B, Schaper C, Ferree S, Liu S, Gao D, et al. Analytical validation of the PAM50-based Prosigna breast cancer prognostic gene signature assay and nCounter analysis system using formalin-fixed paraffin-embedded breast tumor specimens. BMC Cancer 2014;14:177.