Introduction: Machine learning models can provide rich, quantitative characterizations of the tumor and tumor microenvironment (TME); historically, however, it has been difficult to generalize trained models to samples from clinical trials that were not used in training. Here we evaluate the ability to deploy a machine learning-based model (ML model), with no additional model training, to identify non-small cell lung carcinoma (NSCLC) tissue regions and lymphocytes within the tumor and TME on H&E-stained images from clinical trial samples.

Methods: The ML model was previously trained on NSCLC samples of both squamous cell carcinoma and lung adenocarcinoma subtypes from commercial and clinical datasets. The ML model was deployed on an AstraZeneca-sponsored phase II clinical trial of novel anti-cancer agents in patients with metastatic NSCLC. To validate the lymphocyte predictions from the H&E-stained images, we established a reference dataset for manual vs digital concordance consisting of 300 “frames”, each 150 × 150 microns, sampled from the trial dataset, excluding frames with inadequate tissue quality or artifacts. For each frame, we collected exhaustive annotations from 5 pathologists to produce quantitative estimates of lymphocytes. Altogether, 43,932 annotations were collected and used to compute pathologist consensus scores for each frame. These consensus scores were then correlated with the scores of each individual pathologist (inter-reader agreement) and with the PathAI-derived automated scores (manual vs digital agreement).

Results: The PathAI system was successfully deployed on 169 H&E-stained images from the phase II clinical trial to exhaustively identify tumor-associated lymphocytes in each whole-slide image. In total, PathAI classified 2,859,796 lymphocytes, an average of 16,922 lymphocytes per image. We used frames-based validation to determine the correlation between the automated scores and the consensus scores from pathologists, who hand-labeled individual lymphocytes within image frames. The PathAI platform's automated scores correlated strongly with the reference consensus scores (r² = 0.84, CI [0.80–0.87]), similar to the level of agreement achieved between individual pathologists (r² = 0.80, CI [0.76–0.85]).

Conclusions: The PathAI system showed strong generalizability for the identification of lymphocytes within the tumor and TME on H&E-stained images from NSCLC clinical trial samples. These results support the broad deployment of ML-based systems for automated, single-cell-resolution characterization of disease pathology from clinical trial material.
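
The frames-based validation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how consensus scores or confidence intervals were computed, so the sketch assumes that the consensus for a frame is the median of the pathologists' lymphocyte counts, that agreement is the squared Pearson correlation, and that inter-reader agreement compares each reader with the median of the remaining readers. All function names and the toy counts are hypothetical.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def consensus(reader_counts):
    """Per-frame consensus, ASSUMED here to be the median across readers."""
    return [median(frame) for frame in reader_counts]


def model_vs_consensus_r2(model_counts, reader_counts):
    """Squared correlation between model counts and the reader consensus."""
    return pearson_r(model_counts, consensus(reader_counts)) ** 2


def inter_reader_r2(reader_counts):
    """Mean squared correlation of each reader against the median of the
    other readers (a leave-one-out scheme, ASSUMED for this sketch)."""
    n_readers = len(reader_counts[0])
    scores = []
    for i in range(n_readers):
        own = [frame[i] for frame in reader_counts]
        others = [median(list(frame[:i]) + list(frame[i + 1:]))
                  for frame in reader_counts]
        scores.append(pearson_r(own, others) ** 2)
    return mean(scores)


# Toy data (made up): 4 frames, 5 readers per frame, plus model counts.
toy_reader_counts = [
    [10, 12, 11, 9, 10],
    [30, 28, 33, 31, 29],
    [5, 6, 4, 5, 7],
    [20, 22, 19, 21, 20],
]
toy_model_counts = [11, 29, 6, 19]

if __name__ == "__main__":
    print("model vs consensus r^2:",
          model_vs_consensus_r2(toy_model_counts, toy_reader_counts))
    print("inter-reader r^2:", inter_reader_r2(toy_reader_counts))
```

Comparing each reader against a consensus that excludes that reader avoids inflating inter-reader agreement; whether the study did this is not stated in the abstract.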

Citation Format: Ben Glass, Laura Dillon, Guillaume Chhor, Sara Hoffman, Varsha Chinnaobireddy, Sai Chowdary Gullapally, Andy Beck, Jason Hipp. Robust deployment of ML models quantifying the H&E tumor microenvironment in NSCLC subjects from an AstraZeneca-sponsored phase II clinical trial [abstract]. In: Proceedings of the AACR Virtual Special Conference on Artificial Intelligence, Diagnosis, and Imaging; 2021 Jan 13-14. Philadelphia (PA): AACR; Clin Cancer Res 2021;27(5_Suppl):Abstract nr PO-072.