Abstract
Brain tumors are the leading cause of disease-related death in children and young adults ages 0-19 in populous countries such as the United States. Each year, 4,000 children and young adults are diagnosed with a brain or central nervous system tumor in the United States. Brain tumors are complex and difficult to treat in growing children, and current treatments often cause significant, lifelong side effects. Furthermore, the FDA has approved only five drugs to treat pediatric brain tumors in the last 20 years. Founded in 2011, the Children’s Brain Tumor Network (CBTN) is focused on accelerating the pace of translational research, discovering new treatments, and informing precision medicine for children diagnosed with brain tumors. CBTN comprises 32 member institutions and hospitals with over 4,700 enrolled patient subjects, spanning 30+ brain tumor diagnoses, over 66,000 biobanked samples, and 150 preclinical models. Longitudinal clinical data is also collected for every subject enrolled in the observational protocol.
Through large-scale data generation efforts funded by the NCI and foundational support, CBTN has whole genome, RNA-seq, and other molecular characterization for over half of the enrolled patient population. Additionally, efforts are underway to collect all pathology and radiology imaging and reporting for the subjects. With sequencing performed by multiple vendors, imaging protocols differing across hospitals, and complex clinical treatment and longitudinal follow-up data translated from EHR systems, CBTN has created a rich but complex data landscape that is the largest of its kind in the world. To accelerate the process of going from data to cures, the data needs to be centralized, organized, and easily distributable. To do this, CBTN has built a first-of-its-kind data workflow that acts as the inventory system for its various data assets. Using a modern data stack including dbt, PostgreSQL, Meltano, and AWS, combined with FHIR as an interchange standard, data from multiple disparate sources such as REDCap, EHR systems, and PACS flow in near “real-time” to be used as integrated data resources.
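As an illustration of what using FHIR as an interchange standard can look like in practice, the following is a minimal sketch, not CBTN's actual pipeline. It maps one flat REDCap-style record to a minimal FHIR R4 Patient resource. The input field names (`subject_id`, `sex`, `dob`), the sex coding, and the identifier system URI are all hypothetical and chosen only for this example.

```python
import json

# Hypothetical identifier namespace; not an actual CBTN system URI.
IDENTIFIER_SYSTEM = "urn:example:cbtn:subject-id"

# Hypothetical source coding mapped to FHIR administrative-gender codes.
GENDER_MAP = {"1": "male", "2": "female"}


def redcap_to_fhir_patient(record: dict) -> dict:
    """Map one flat REDCap-style record to a minimal FHIR R4 Patient resource."""
    patient = {
        "resourceType": "Patient",
        "identifier": [
            {"system": IDENTIFIER_SYSTEM, "value": record["subject_id"]}
        ],
        # Unrecognized or missing source codes fall back to "unknown".
        "gender": GENDER_MAP.get(record.get("sex", ""), "unknown"),
    }
    # birthDate is optional in FHIR; include it only when the source has one.
    if record.get("dob"):
        patient["birthDate"] = record["dob"]
    return patient


# Hypothetical example record (assumed field names and values).
example = {"subject_id": "SUBJ-0001", "sex": "2", "dob": "2015-03-09"}
print(json.dumps(redcap_to_fhir_patient(example), indent=2))
```

Normalizing each source into a shared resource shape like this is one way downstream warehouse models could be written once against a single schema rather than once per source system.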
This modern, multimodal, multi-institutional warehouse allows CBTN to distribute data quickly and accurately to translational researchers around the world and to contribute data to key research efforts such as AACR Project GENIE, Kids First Data Resource Center, NCI Childhood Cancer Data Initiative, and the NCI’s Open Targets Platform.
Citation Format: Bailey K. Farrow, Nicholas Van Kuren, Nathan Young, Christopher Friedman, Meen Chul Kim, Alex Lubneuski, Jennifer Mason, Thinh (Bin) Nguyen, Zeinab Helili, Elizabeth Frenkel, Catherine Sullivan, Ariana Familiar, Yuankun Zhu, Mateusz Koptyra, Tatiana Patton, Jena Lilly, Phillip B. Storm, Adam Resnick, Allison P. Heath. Establishing a multimodal data warehousing platform to accelerate discoveries in pediatric brain tumors for the Children’s Brain Tumor Network. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 3565.