Background: Breast cancer molecular subtypes differ in their clinicopathological characteristics and outcomes. Here, we assessed the agreement between microarray-defined and author-reported molecular subtypes using publicly available breast cancer datasets.

Methods: The Gene Expression Omnibus (GEO) was searched for datasets in which the authors had determined molecular subtypes. The molecular subtype was then re-calculated for each patient according to the St. Gallen guidelines, based on the microarray-measured expression of ER, HER2 and MKI67 (Basal: ER negative + HER2 negative; Luminal A: ER positive + HER2 negative + low MKI67; HER2 enriched: HER2 positive + ER negative; Luminal B: ER positive + HER2 positive, or ER positive + HER2 negative + high MKI67). The expression thresholds were 500 for ER (probe set 205225_at), 4800 for HER2 (216836_s_at) and 470 for MKI67 (212021_s_at). Sensitivity and specificity were calculated for each subtype separately, with the author-reported subtype as the reference.
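
The classification rule and the per-subtype evaluation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study: the function names are hypothetical, and treating a value strictly above the threshold as "positive" or "high" is an assumption, since the abstract does not state how values equal to a threshold are handled.

```python
# Thresholds quoted in the Methods (microarray expression values).
ER_CUTOFF = 500      # probe set 205225_at
HER2_CUTOFF = 4800   # probe set 216836_s_at
MKI67_CUTOFF = 470   # probe set 212021_s_at


def classify_subtype(er, her2, mki67):
    """Assign a subtype from three expression values (hypothetical helper).

    Assumption: "positive"/"high" means strictly greater than the cutoff.
    """
    er_pos = er > ER_CUTOFF
    her2_pos = her2 > HER2_CUTOFF
    high_mki67 = mki67 > MKI67_CUTOFF

    if not er_pos:
        # ER negative: HER2 status separates HER2 enriched from Basal.
        return "HER2 enriched" if her2_pos else "Basal"
    if her2_pos or high_mki67:
        # ER positive with HER2 positive, or HER2 negative with high MKI67.
        return "Luminal B"
    return "Luminal A"


def sensitivity_specificity(reported, predicted, subtype):
    """One-vs-rest sensitivity and specificity for a single subtype.

    `reported` is the author-reported label (reference) and `predicted`
    is the microarray-derived label, both as equal-length sequences.
    """
    tp = fn = fp = tn = 0
    for ref, pred in zip(reported, predicted):
        if ref == subtype:
            if pred == subtype:
                tp += 1
            else:
                fn += 1
        else:
            if pred == subtype:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Each subtype is evaluated one-vs-rest, so a patient misclassified from Luminal A to Luminal B lowers the sensitivity for Luminal A and the specificity for Luminal B at the same time.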

Results: Author-reported molecular subtypes were available in six datasets (GSE1456, GSE21653, GSE25066, GSE20711, GSE31519 and GSE17907) covering a total of 983 patients. Of these, 380, 306, 169 and 128 were Basal, Luminal A, Luminal B and HER2 enriched, respectively. The microarray-based subtype determination achieved sensitivities of 0.70, 0.55, 0.63 and 0.38 and specificities of 0.96, 0.87, 0.67 and 0.99 for the Basal, Luminal A, Luminal B and HER2 enriched subtypes, respectively. Finally, the option to filter for molecular subtypes was implemented into our online biomarker validation platform at

Discussion: Microarray-based classification showed the best overall agreement with the independent subtype classification for Basal tumors (sensitivity 0.70, specificity 0.96), while the Luminal B subtype displayed the highest discordance. Our registration-free online service enables the validation of gene expression-based biomarkers in each subtype separately.

Citation Information: Cancer Res 2012;72(24 Suppl):Abstract nr P1-07-14.