Background: Breast cancer is the second most common cancer in women. In 2022, it accounted for 15% of all new cancer cases and was the fourth leading cause of cancer death. Precision medicine depends on distinguishing molecular subtypes for prognosis and treatment selection in a clinical setting. While intrinsic subtype classification from patient next-generation sequencing (NGS) results is well established, the approach has not been comprehensively described for patient-derived xenograft (PDX) models, which have proven powerful in translational research. The National Cancer Institute's Patient-Derived Models Repository (NCI PDMR) provides rich information for developing such a method.

Materials and Methods: Normalized gene expression data for breast cancer PDX samples and patient specimens (originators) were derived from RNA-seq data using tximport and DESeq2. Immunohistochemistry (IHC) was used to determine ER, PR, and HER2 receptor status in these tumor specimens. PAM50 classification was performed with the R package genefu. For further analysis, the PAM50 centroids for all five subtypes were also obtained from genefu.
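
As an illustration of the centroid-based assignment described above, the following minimal Python sketch assigns a sample to the subtype whose centroid it correlates with most strongly. It assumes Spearman rank correlation as the similarity measure; it is not the genefu implementation, and the function names, gene subset, and centroid values are hypothetical toy inputs, not the PAM50 centroids or study data.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def nearest_centroid(sample, centroids):
    """Return (best_subtype, scores) using genes shared by sample and centroid."""
    scores = {}
    for subtype, centroid in centroids.items():
        genes = sorted(set(sample) & set(centroid))
        scores[subtype] = spearman([sample[g] for g in genes],
                                   [centroid[g] for g in genes])
    return max(scores, key=scores.get), scores

# Hypothetical toy centroids over four genes (illustrative values only).
toy_centroids = {
    "LumA":  {"ESR1": 2.0,  "ERBB2": 0.0,  "KRT5": -1.0, "MKI67": -2.0},
    "Her2":  {"ESR1": -1.0, "ERBB2": 2.0,  "KRT5": -0.5, "MKI67": 1.0},
    "Basal": {"ESR1": -2.0, "ERBB2": -1.0, "KRT5": 2.0,  "MKI67": 1.0},
}
# Hypothetical normalized expression values for one sample.
toy_sample = {"ESR1": -1.5, "ERBB2": -0.8, "KRT5": 1.7, "MKI67": 0.9}

best, scores = nearest_centroid(toy_sample, toy_centroids)
print(best, scores)
```

In practice the comparison would run over the full PAM50 gene set after appropriate normalization; the four-gene example only shows the mechanics of the assignment.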

Results: Using RNA-seq data from 43 PDX models (180 PDX samples, 4 to 6 samples per model), we predicted subtypes at the model level with the PAM50 method: 1 Luminal A, 5 Luminal B, 6 Her2, 30 Basal, and 1 Normal, covering the whole spectrum of PAM50 subtypes. Thirty originators were also classified: 8 Luminal A, 9 Luminal B, 2 Her2, and 11 Basal. For the 11 originators with matched PDX models, 91% of the predicted subtypes were identical between originator and model, with a Cohen's kappa of 0.80, indicating high inter-rater agreement. We also compared the predictions with IHC-based subtypes: agreement was 90% for the 10 originators with IHC-based subtypes and 88% for the 24 PDX models with IHC data. Across the 180 PDX samples, 33 of the 43 PDX models (77%) had consistent predicted PAM50 molecular subtypes across passages and lineages. Among the discordant samples, we observed cases such as a mixture of Luminal B and Basal calls, which can reasonably be explained by an AR-positive signal on IHC. This discrepancy motivates further subclassification of PDX models within the Basal subtype.
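
For readers unfamiliar with the agreement statistic reported above, Cohen's kappa corrects the observed agreement for the agreement expected by chance. The sketch below is a minimal Python illustration; the function name and the label lists are hypothetical and are not the study's calls.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of pairs with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of marginal frequencies, summed over categories.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c]
                   for c in set(count_a) | set(count_b)) / (n * n)
    if expected == 1:
        return 1.0  # both lists use a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical subtype calls for four matched pairs (illustrative only).
originator_calls = ["Basal", "Basal", "LumA", "LumB"]
pdx_calls = ["Basal", "Basal", "LumA", "LumA"]

kappa = cohens_kappa(originator_calls, pdx_calls)
print(round(kappa, 2))
```

Here three of four pairs agree (observed agreement 0.75) and the chance agreement is 0.375, so kappa is (0.75 - 0.375) / (1 - 0.375) = 0.6.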

Conclusions: Using high-throughput gene expression profiles from many patients and patient-derived model samples, we have demonstrated that the classic PAM50 classification algorithm, originally developed with microarray data, can be applied to our RNA-seq data. Overall, this study should serve as a primer for identifying PDX-based subtypes, starting with breast cancer.

Citation Format: Peter I. Wu, Lindsay Dutko, Shahanawaz Jiwani, Li Chen, Biswajit Das, Ting-Chia Chang, Yvonne A. Evrard, Chris A. Karlovich, Alyssa Chapman, Brandie Fullmer, Ashley Hayes, Ruth Thornton, Nikitha Nair, Kelly Benauer, Gloryvee Rivera, Thomas Forbes, John Carter, Suzanne Borgel, Tiffanie Miner, Chelsea McGlynn, Justine Mills, Shannon Uzelac, Tia Shearer, Lauren Hicks, Michelle Norris, Carley Border, Sergio Alcoser, Thomas Walsh, Michael Mullendore, Michelle Eugeni, Dianne Newton, Melinda G. Hollingshead, P. M. Williams, James H. Doroshow. Molecular subclassification of NCI PDMR breast cancer models using PAM50 gene expression signature [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 2050.