Background: Tertiary lymphoid structures (TLS) are vascularized lymphocyte aggregates in the tumor microenvironment (TME) that correlate with better patient outcomes. Previous studies identified a 12-chemokine gene expression signature associated with disease progression and with the type and degree of TLS. Such signatures could provide insight important for clinical decision-making during pathologic evaluation, but predicting gene expression from whole slide images (WSI) may be impeded by low prediction accuracy and lack of interpretability. Here we report an artificial intelligence (AI)-based, state-of-the-art workflow to predict the 12-chemokine TLS gene signature from lung cancer WSI and to identify histological features relevant to model predictions.

Methods: Models were trained using 538 cases of paired lung cancer WSI and mRNA-seq expression data (The Cancer Genome Atlas). Cell and tissue classifiers based on convolutional neural networks (CNN) were trained on WSI, and a graph neural network (GNN) model that leverages the relative spatial arrangement of the CNN-identified cells and tissues was used to predict gene expression. GNN predictions of TLS signature genes were compared with the predictions of models trained using hand-crafted, task-specific features (TLS feature models) describing the number, size, and cellular composition of identified TLS. The Pearson correlation coefficient was used to assess the accuracy of GNN and TLS feature model predictions. GNNExplainer (1), a tool that simultaneously identifies a subgraph and a subset of node features important for predictions, was applied to interpret the GNN model predictions.
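
As an illustration of the accuracy metric named in the Methods, the following is a minimal sketch of a per-gene Pearson correlation between predicted and measured expression. It is not the authors' code: the function names, the dictionary layout, and the placeholder gene labels and values are assumptions made only for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def per_gene_correlation(predicted, measured):
    """Return {gene: Pearson r} for every gene present in `predicted`.

    Both arguments are hypothetical dicts mapping a gene label to a list of
    per-case expression values, with cases in the same order in both dicts.
    """
    return {gene: pearson_r(predicted[gene], measured[gene]) for gene in predicted}


if __name__ == "__main__":
    # Toy values with placeholder gene labels; not data from the study.
    predicted = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [1.0, 2.0, 3.0]}
    measured = {"GENE_A": [2.0, 4.0, 6.0], "GENE_B": [3.0, 2.0, 1.0]}
    for gene, r in per_gene_correlation(predicted, measured).items():
        print(gene, round(r, 3))
```

Under this layout, the same function could be applied to the outputs of both model types so that their per-gene correlations are compared on identical cases.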

Results: GNN model predictions show reasonable accuracy: GNN models significantly predicted mRNA expression of all 12 genes (p<0.05), and the predicted expression of six genes was moderately correlated with ground-truth measurements (Pearson-r>0.5). The correlation of GNN predictions was higher than that of the TLS feature models for all 12 signature genes. The GNNExplainer identified relevant features including the mean and standard deviation of lymphocyte count, and fraction of lymphocytes in cancer stroma. Subgraphs selected by the GNNExplainer focus on, but extend beyond, regions of human-annotated TLS objects, indicating that TLS may influence gene expression and the TME in regions beyond their immediate vicinity.

Conclusion: Here, we show a comparison of two interpretable AI methods for the prediction of TLS-induced gene expression from WSI. The outperforming GNN-based approach is highly reproducible and accurate, predicting histopathology features relevant to TLS that may be used to inform patient prognosis and treatment. These methods could be applied to predict additional clinically relevant transcriptomic signatures.

1. Ying R, et al. 2019. arXiv:1903.03894v4

Citation Format: Ciyue Shen, Collin Schlager, Deepta Rajan, Maryam Pouryahya, Mary Lin, Victoria Mountain, Ilan Wapinski, Amaro Taylor-Weiner, Benjamin Glass, Robert Egger, Andrew Beck. Application of an interpretable graph neural network to predict gene expression signatures associated with tertiary lymphoid structures in histopathological images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1922.