While tumor-reactive CD4+ T cells have been associated with effective immunotherapy responses, accurate prediction of MHC class II-restricted ligands remains a challenge, limiting our ability to harness CD4+ immunity for cancer therapy. To this end, we have developed a state-of-the-art MHC class II binding predictor, neonmhc2, trained on HLA-II ligands identified by mass spectrometry (MS) of cell lines engineered to express a single affinity-tagged HLA-II allele. Over 60 HLA-II alleles have been characterized to date. We demonstrate that neonmhc2 outperforms NetMHCIIpan, the current benchmark for class II prediction, in distinguishing (1) HLA-II ligands presented in cell lines and tissues and, critically, (2) immunogenic CD4+ epitopes identified with tetramer-guided epitope mapping. Furthermore, we show that numerous neoantigen peptides that were ranked highly by neonmhc2 but poorly by NetMHCIIpan were immunogenic in an ex vivo induction. We next studied HLA-II antigen processing to further improve our ability to predict CD4+ epitopes. We determined a gene-level bias by comparing transcript expression and gene length to the frequency of observations in MS, finding that genes encoding secreted proteins are over-represented in HLA-II ligandomes from tissues. We built numerous primary sequence-based processability predictors trained on MS data, but achieved a significant improvement in prediction only with one simple feature: whether a candidate peptide contained sequence that overlapped a previously observed HLA-II ligand. By combining neonmhc2 binding prediction, transcript expression, gene bias, and the overlap feature in an integrated presentation predictor, we achieved up to a 61-fold improvement over NetMHCIIpan alone in predicting HLA-II peptides presented in tissue MS data. We also sought to understand which cells present HLA-II ligands in the tumor microenvironment, in order to elucidate the presentation pathway most relevant to immunotherapy. By leveraging publicly available bulk and single-cell RNA-seq data, we found that professional antigen-presenting cells, rather than tumor cells, are primarily responsible for HLA-II presentation. We developed a novel SILAC-based MS workflow to directly interrogate peptides that are derived from phagocytosed tumor cells and presented by dendritic cells. The experiment revealed that peptides from mitochondrial genes are preferentially presented from phagocytosed cells. In conclusion, by integrating proteomics and genomics data at large scale, we have defined new rules for understanding HLA-II processing and presentation, particularly in the context of the tumor microenvironment. This work should enhance our ability to predict CD4+ epitopes for immunotherapy.

Citation Format: Dewi Harjanto, Jennifer G. Abelin, Matthew Malloy, Prerna Suri, Tyler Colson, Scott P. Goulding, Amanda L. Creech, Lia R. Serrano, Gibbs Nasir, Yusuf Nasrullah, Christopher D. McGann, Diana Velez, Ying S. Ting, Asaf Poran, Daniel A. Rothenberg, Sagar Chhangawala, Alex Rubinsteyn, Jeff Hammerbacher, Richard B. Gaynor, Edward F. Fritsch, Rob C. Oslund, Dominik Barthelme, Terri A. Addona, Christina M. Arieta, Michael S. Rooney. Enhanced HLA-II epitope prediction for immunotherapy with novel proteomics and genomics approaches [abstract]. In: Proceedings of the AACR Special Conference on Tumor Immunology and Immunotherapy; 2019 Nov 17-20; Boston, MA. Philadelphia (PA): AACR; Cancer Immunol Res 2020;8(3 Suppl):Abstract nr B23.