Ovarian cancer is one of the leading causes of cancer death among women in the US. The American Cancer Society (ACS) ranks it as the fifth leading cause of cancer deaths among women. Risk has a hereditary component and is higher for patients with more relatives who have had ovarian cancer. The Cancer Genome Atlas (TCGA) is a joint project of the NCI and NHGRI that generates genomic data from cancer patients together with matched normal samples. Missense mutations are single-nucleotide variants that change the encoded amino acid. A missense variant can disrupt protein structure and thereby alter the function of the affected protein. In this study, we investigated genes with high expression levels in the TCGA RNA-seq dataset by characterizing the protein structure changes that missense mutations cause in candidate genes associated with ovarian cancer. We also analyzed the association of these genes with ovarian cancer prognosis. Our aim was to select genes with pathogenic consequences. We selected the 44 genes with the highest expression values in the TCGA ovarian cancer RNA-seq data. For these 44 input genes, we retrieved protein sequences from the Ensembl BioMart database and missense mutation annotations from cBioPortal. Specifically, the missense mutation dataset retrieved from cBioPortal comes from the Pan-Cancer Analysis of Whole Genomes project. We then used Phyre2 to predict the 3-D structures of the wild-type and mutant proteins and to identify structural changes. However, some genes were not suitable for Phyre2 analysis. For example, RPL41 has only 25 amino acids, whereas Phyre2 requires at least 30. TMSL3 (TMSB4XP8) is a pseudogene found in neither NCBI nor AlphaFold. We then searched DrugBank for the 42 genes for which Phyre2 predicted protein structure changes in order to identify drug targets.
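The sequence-preparation step described above can be illustrated with a minimal Python sketch. It assumes that each missense annotation is written in the short protein-change form (for example `p.T3S`) and that the mutant sequence is built by substituting one residue in the wild-type sequence before submission for structure prediction. The function names, the regular expression, and the toy sequence are hypothetical and are not taken from the study; only the 30-residue minimum comes from the text.

```python
import re

# Minimum sequence length that the abstract reports for Phyre2 input.
MIN_LENGTH = 30

# Hypothetical parser for short protein changes such as "p.T3S" or "T3S".
_PROTEIN_CHANGE = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z])$")


def apply_missense(wild_type: str, change: str) -> str:
    """Return the mutant protein sequence for a single missense change."""
    match = _PROTEIN_CHANGE.match(change)
    if match is None:
        raise ValueError(f"not a simple missense change: {change!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(wild_type):
        raise ValueError(f"position {pos} is outside the sequence")
    # Guard against annotations made on a different transcript or isoform.
    if wild_type[pos - 1] != ref:
        raise ValueError(
            f"expected {ref} at position {pos}, found {wild_type[pos - 1]}"
        )
    if ref == alt:
        raise ValueError("reference and alternate residues are identical")
    # Positions in the annotation are 1-based; Python strings are 0-based.
    return wild_type[: pos - 1] + alt + wild_type[pos:]


def long_enough(sequence: str) -> bool:
    """Check the minimum-length requirement before structure prediction."""
    return len(sequence) >= MIN_LENGTH


if __name__ == "__main__":
    toy_wild_type = "MKTAYIAK"  # made-up sequence, for illustration only
    print(apply_missense(toy_wild_type, "p.T3S"))
    print(long_enough(toy_wild_type))
```

Checking the reference residue before substituting is a deliberate choice in this sketch: a mismatch usually means the annotation and the retrieved sequence refer to different isoforms, which would make any predicted structural change meaningless.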
We also performed Kaplan-Meier survival analysis to further evaluate the clinical consequences of the candidate genes. Our preliminary results show that 26 genes have drug targets, and that high expression of several of them is associated with either decreased or increased survival in ovarian cancer patients. This indicates that highly expressed genes with protein structure changes due to missense mutations could be potential biomarkers for ovarian cancer treatment. A functional annotation study could further validate the association of our prioritized gene list with ovarian cancer. In the future, we also plan to compare prediction results between genders and race groups. Our results could guide medical researchers in prioritizing drug targets and developing better treatment strategies for ovarian cancer. Our approach to prioritizing candidate genes is applicable to other cancers as well, and we plan to develop bioinformatic pipelines that automatically predict candidate genes with a significant effect on patient survival for any cancer type.
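As an illustration of the Kaplan-Meier analysis mentioned above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the authors' pipeline: the function name and the follow-up data are invented for the example, and a real analysis would typically compute one curve per expression group (for instance high versus low expression of a candidate gene) and compare them with a log-rank test.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit estimate of survival.

    durations: follow-up time for each patient.
    events: 1 (or True) if death was observed, 0 (or False) if censored.
    Returns a list of (time, survival_probability) at each observed death time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Number of observed deaths at each time point.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Number of patients leaving the risk set (death or censoring) at each time.
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # S(t) = S(previous) * (1 - deaths / number at risk just before t)
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Patients censored at t are still at risk at t and removed afterwards.
        at_risk -= exits[t]
    return curve


if __name__ == "__main__":
    # Made-up follow-up times (months) and event flags, for illustration only.
    months = [5, 8, 8, 12, 15]
    died = [1, 1, 0, 1, 0]
    for time, prob in kaplan_meier(months, died):
        print(f"t={time}: S(t)={prob:.2f}")
```

Running the estimator separately on the high-expression and low-expression patient groups gives the two curves whose separation indicates whether high expression is associated with better or worse survival.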

Citation Format: Ian W. Hou, Yongsheng Bai. Computational identification of ovarian cancer candidate genes with mutated protein structures caused by missense variants. [abstract]. In: Proceedings of the AACR Special Conference: Precision Prevention, Early Detection, and Interception of Cancer; 2022 Nov 17-19; Austin, TX. Philadelphia (PA): AACR; Can Prev Res 2023;16(1 Suppl): Abstract nr P003.