Patient-derived xenografts (PDXs) recapitulate intratumoral spatial heterogeneity and simulate a tumor microenvironment in which human immune and stromal cells in the PDX are replaced over passages by murine cells partially lacking immune function. Histological imaging enables exploration of the spatial heterogeneity and dynamics of cancer, stromal, and immune cell interactions as correlates of tumor stage and therapeutic response over passages. We created a repository of curated hematoxylin and eosin (H&E) images as a community resource for addressing these questions.

Images were generated at five sites within the NCI’s PDX Development and Trial Centers Research Network (PDXNet) and the NCI Patient-Derived Models Repository. Over 900 images, including 739 from PDXs and 190 from paired patients, are hosted on the Seven Bridges Genomics Cancer Genomics Cloud. They represent 42 cancer subtypes, including breast cancer (n=134), colon adenocarcinoma (COAD; n=94), pancreatic cancer (n=87), lung adenocarcinoma (LUAD; n=80), melanoma (n=71), and squamous cell lung cancer (LUSC; n=65). Paired human/PDX images are available for each of these cancers. Human and/or PDX images generated following patient treatment are available for 37 of the subtypes. Most images are from early passages (P0: 158; P1: 292; P2: 152; P3: 69; >P3: 55). Annotations include sex, age, race, ethnicity, and, for most images, pathological assessment of tissue-level percent cancer, stromal, and necrotic cell content (n=639) and tumor stage (n=650). RNA and exome sequencing data are available for 99 and 228 images, respectively, matched at the patient or sample level.

Quality control was performed using HistoQC. Cells were segmented and labeled as neoplastic, necrotic, immune, stromal, or other using Hover-Net, and the predicted total neoplastic cell area correlated with whole-slide pathological assessment of cancer cell percentage (COAD: r=0.51; LUSC: r=0.59). HD-Staining, another classification approach, was applied to a subset of images, and our clinical annotations will facilitate validation of this and related methods. Features of 512 x 512 pixel tiles were computed using the Inception V3 convolutional neural network pre-trained on ImageNet. Unsupervised clustering of these features demonstrates inter-patient heterogeneity within pathologist-annotated tumor regions. A classifier developed using pathologist-annotated cancer, stromal, and necrotic regions and trained on these features in LUSC images (n=10) achieved a cross-validation accuracy of 96% for cancer tiles across LUAD images (n=5). Accuracy was lower for stromal classification (90%), likely reflecting current limitations of our small, but growing, labeled training set.
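
The comparison between predicted neoplastic cell area and pathologist-assessed cancer cell percentage can be illustrated with a minimal Python sketch. It is not the pipeline used for the repository: the per-cell `(label, area)` record layout, the function names, the normalization by total segmented cell area, and the toy values are all assumptions made for illustration, and the label strings only mirror the five cell classes named above. The sketch computes a per-slide neoplastic area fraction and then the Pearson correlation of that fraction with the pathologist's percentage.

```python
from math import sqrt


def neoplastic_fraction(cells):
    """Fraction of segmented cell area labeled 'neoplastic' on one slide.

    `cells` is a hypothetical list of (label, area_in_pixels) pairs, a
    simplified stand-in for per-cell segmentation output.
    """
    total_area = sum(area for _, area in cells)
    if total_area == 0:
        return 0.0
    neoplastic_area = sum(area for label, area in cells if label == "neoplastic")
    return neoplastic_area / total_area


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Toy data (invented for illustration, not taken from the repository).
slides = {
    "slide_a": [("neoplastic", 600), ("stromal", 300), ("immune", 100)],
    "slide_b": [("neoplastic", 300), ("stromal", 500), ("necrotic", 200)],
    "slide_c": [("neoplastic", 800), ("stromal", 150), ("other", 50)],
}
pathologist_percent = {"slide_a": 55.0, "slide_b": 35.0, "slide_c": 85.0}

names = sorted(slides)
predicted = [neoplastic_fraction(slides[name]) for name in names]
assessed = [pathologist_percent[name] for name in names]
print(f"Pearson r = {pearson_r(predicted, assessed):.2f}")
```

In practice the correlation would be computed separately for each cancer type, as in the COAD and LUSC values reported above.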

Our repository of clinically annotated PDX H&E images should aid the community in studying spatial heterogeneity and in training deep learning-based image analysis methods.

Citation Format: Brian S. White, Xingyi Woo, Soner Koc, Todd Sheridan, Steven B. Neuhauser, Akshat M. Savaliya, Lacey E. Dobrolecki, John D. Landua, Matthew H. Bailey, Maihi Fujita, Kurt W. Evans, Bingliang Fang, Junya Fujimoto, Maria Gabriela Raso, Shidan Wang, Guanghua Xiao, Yang Xie, Sherri R. Davies, Ryan C. Fields, R Jay Mashl, Jacqueline L. Mudd, Yeqing Chen, Min Xiao, Xiaowei Xu, Melinda G. Hollingshead, Shahanawaz Jiwani, PDXNet Consortium, Yvonne A. Evrard, Tiffany A. Wallace, Jeffrey A. Moscow, James H. Doroshow, Nicholas Mitsiades, Salma Kaochar, Chong-xian Pan, Moon S. Chen, Luis G. Carvajal-Carmona, Alana L. Welm, Bryan E. Welm, Michael T. Lewis, Ramaswamy Govindan, Li Ding, Shunqiang Li, Meenhard Herlyn, Michael A. Davies, Jack A. Roth, Funda Meric-Bernstam, Carol J. Bult, Brandi Davis-Dusenbery, Dennis A. Dean, Jeffrey H. Chuang. A repository of PDX histology images for exploring spatial heterogeneity and cancer dynamics [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1202.