Abstract
Background: The ability to produce timely, clinically actionable recommendations is a cornerstone of a robust precision oncology program. This requires the assimilation of rapidly changing precision oncology data that flows asynchronously from various EMR and lab sources. A key challenge is harmonization of omics biomarkers from hospital-based and commercial testing labs, rendering these omics data interpretable and comparable between patients, labs and reports. To address this challenge, we describe an automated system leveraging a hybrid ontology and graph structure to permit the rapid harmonization and interpretation of molecular diagnostic results across multiple labs.
Methods: We developed a two-part, fully automated system to ingest and harmonize omics biomarker data from hospital-based and commercial testing labs. First, an ontology-based omics lexicon annotates the discrete laboratory values that labs provide electronically. The ontology model is based on the OWL Manchester syntax, with data stored in a relational document store. Second, a graph structure serves as the knowledge model and relates the annotated lab values to 12 types of clinically informative biomarkers, including sequence and copy number variants, tumor mutational burden, microsatellite instability (MSI), gene rearrangements, wild-type status, and expression biomarkers. The graph structure leverages the ontology to fully describe the hierarchical and categorical data associated with each biomarker. The omics knowledge model is accessed through a RESTful API backed by a relational database. Molecular data are acquired through custom JSON schemas and a RESTful API.
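The two-part design described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries, lab names, biomarker labels, and the `harmonize` and `ancestors` helpers are all hypothetical, and plain in-memory dictionaries stand in for the ontology store and the graph database. The sketch only shows the idea that a lexicon first normalizes lab-specific strings, and a graph then places the harmonized biomarker within a hierarchy of more general classes.

```python
# Hypothetical lexicon: maps (lab, raw reported value) to a normalized
# (gene, alteration, biomarker type) annotation. Entries are illustrative only.
LEXICON = {
    ("lab_a", "BRAF p.V600E"): ("BRAF", "V600E", "sequence_variant"),
    ("lab_b", "BRAF c.1799T>A (p.Val600Glu)"): ("BRAF", "V600E", "sequence_variant"),
    ("lab_a", "ERBB2 amplification"): ("ERBB2", "amplification", "copy_number_variant"),
}

# Hypothetical knowledge graph stored as child -> parent edges, so that a
# specific biomarker can be rolled up to its class-level biomarker.
PARENTS = {
    "BRAF V600E": "BRAF mutation",
    "BRAF mutation": "BRAF alteration",
    "ERBB2 amplification": "ERBB2 alteration",
}


def harmonize(lab, raw_value):
    """Return the harmonized biomarker label, or None if the value is not in the lexicon."""
    annotation = LEXICON.get((lab, raw_value))
    if annotation is None:
        return None
    gene, alteration, _biomarker_type = annotation
    return f"{gene} {alteration}"


def ancestors(biomarker):
    """Walk child -> parent edges and return the chain of more general biomarkers."""
    chain = []
    node = biomarker
    while node in PARENTS:
        node = PARENTS[node]
        chain.append(node)
    return chain


if __name__ == "__main__":
    for lab, raw in [("lab_a", "BRAF p.V600E"), ("lab_b", "BRAF c.1799T>A (p.Val600Glu)")]:
        label = harmonize(lab, raw)
        print(lab, "->", label, "->", ancestors(label))
```

In this sketch, two differently formatted lab strings resolve to the same harmonized label, and the parent edges make it possible to treat a specific variant and its class-level biomarker as related entries.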
Results: Feeds from 7 commercial and hospital-based labs delivered >12,000 molecular diagnostic reports containing >2 million individual biomarker observations, with each report integrated within 2 minutes of electronic data receipt. As of November 2018, 81.2% of reports had at least one biomarker observation, while the remaining reports lacked results. Of these biomarker observations, 95.8% were represented in the ontology/graph structure, resulting in successful linkage to ~22,000 distinct harmonized biomarkers out of more than 12 million in the knowledge base. Of the linked, harmonized variant biomarkers, 4% are class-level (e.g., BRAF mutation), with the remainder being more specific (e.g., BRAF V600E). Within the set of more specific biomarkers, 54% represent a protein-altering variant biomarker.
Conclusions: We have built an automated, scalable system that uses a novel ontology/graph structure to reliably harmonize and integrate omics data into a real-world oncology precision medicine platform. Proof-of-concept validation has been demonstrated in over 12,000 patient reports, with the results being used to enable clinical trial matching and outcomes research.
Citation Format: James E. Shima, James L. Chen, Matthew J. Glick, Ryan A. Warrier, Eric C. Abruzzese, Walter C. Mankowski, Jonathan Hirsch. A hybrid ontology and graph system enabling precision oncology applications through real-time incorporation of omics biomarkers into a scalable real-world data platform [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 1676.