Abstract
Background: The Cancer Genome Atlas (TCGA) project has performed molecular profiling of human tumors using genomic, epigenomic, transcriptomic, and proteomic platforms, and each tumor is comprehensively characterized by around 100,000 molecular attributes in addition to typical clinical attributes. To make these data directly available to the entire cancer research community, several data portals have been developed. However, none of the existing data portals allows systematic exploration and interpretation of the complex relationships among the large number of clinical and molecular attributes.
Methods: We developed LinkedOmics (http://www.linkedomics.org), a web platform that focuses on the discovery and interpretation of associations between clinical and molecular attributes. LinkedOmics includes three data analysis modules. The LinkFinder module allows flexible exploration of associations between a molecular or clinical attribute of interest and all other attributes, providing the opportunity to analyze and visualize associations between billions of attribute pairs for each cancer cohort. The LinkCompare module enables easy comparison of the associations identified by LinkFinder, which is particularly useful in multi-omics and pan-cancer analyses. The LinkInterpreter module transforms identified associations into biological understanding through pathway and network analysis. All modules provide user-friendly data visualization.
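The kind of one-versus-all association scan that LinkFinder performs can be illustrated with a minimal sketch. This is not the LinkedOmics implementation: the choice of Pearson correlation, the function names (`pearson`, `scan_associations`), and the toy attribute values are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (assumed statistic)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant attribute carries no association signal
    return cov / sqrt(var_x * var_y)


def scan_associations(target, attributes):
    """Score every attribute against the target and rank by absolute correlation."""
    scores = {name: pearson(target, values) for name, values in attributes.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy cohort: one attribute of interest measured in five samples,
# compared against three other made-up attributes from the same samples.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
attributes = {
    "gene_A": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with the target
    "gene_B": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the target rises
    "gene_C": [1.0, 3.0, 2.0, 3.0, 1.0],   # no linear relationship
}

ranked = scan_associations(target, attributes)
for name, r in ranked:
    print(f"{name}: r = {r:+.2f}")
```

In a real cohort the attribute dictionary would hold tens of thousands of entries per platform, and the statistic would need to match the data types involved (for example, a survival or categorical test for clinical attributes), followed by multiple-testing correction.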
Results: The current version of LinkedOmics contains multi-omics data and clinical data for 32 cancer types and a total of 11,158 patients from the TCGA project. It is also the first multi-omics database that integrates mass spectrometry (MS)-based global proteomics data generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) on selected TCGA tumor samples. In total, LinkedOmics has more than a billion data points. We used several case studies to demonstrate the utility of LinkedOmics in revealing the functional impact of somatic mutations or copy number alterations on mRNA or protein expression, in deriving a multi-omics-based protein signature for poor prognosis, in performing pan-cancer analysis to identify a survival-associated gene expression signature, and in connecting novel pan-cancer poor-prognosis markers to tumor invasiveness and aggressiveness.
Conclusions: LinkedOmics provides a unique platform for biologists and clinicians to access, analyze, and compare cancer multi-omics data within and across tumor types. With five case studies, we demonstrated the power of LinkedOmics in cancer research. Although the current version of LinkedOmics includes only TCGA and CPTAC data, it can be easily extended to support other cohort-based or user-provided multi-omics studies.
Citation Format: Suhas Vasaikar, Peter Straub, Jing Wang, Bing Zhang. LinkedOmics: Analyzing multi-omics data within and across 32 cancer types [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 2295.