The St. Jude Cloud, an online resource for cancer researchers to download, upload, process, and visualize pediatric cancer data, has announced the release of the St. Jude Survivorship Portal, which contains a wealth of clinical and genomic data about survivors of childhood cancers.

Generation of large datasets, such as genomic and gene-expression datasets, has become increasingly common in cancer research. Although these datasets can yield important insights, their full value will not be realized without practical ways for researchers to view and analyze the information.

That challenge inspired the creation of the St. Jude Cloud, an online resource for sharing, visualizing, and analyzing large datasets about children with cancer and other serious illnesses. With the St. Jude Cloud and related cloud-based platforms, such as the NCI Cloud Resources, researchers can not only download and upload data, but also interactively explore and work with data within the platform, eliminating the need to store these large datasets on local hardware.

St. Jude Cloud houses more than 10,000 pediatric whole-genome sequences drawn from several large cohorts of patients with pediatric cancer, cancer survivors, and survivors of other catastrophic diseases. Importantly, clinical genomic data from cancer patients is released in real time to speed scientific progress and ensure timely discovery. Researchers can now also examine associations between clinical phenotypes and genomic data through the St. Jude Survivorship Portal, unveiled at October's annual meeting of the American Society of Human Genetics. The portal links high-quality genomic data with associated clinical data from the St. Jude LIFE study, which comprises information from nearly 6,000 patients who were treated for cancer and other serious conditions at St. Jude Children's Research Hospital (Memphis, TN) between 1962 and 2012 and who returned to the institution for follow-up after 5 or more years.

Researchers can work in various ways with this and other datasets provided on St. Jude Cloud. For example, with GenomePaint, one of the powerful data-visualization tools on the platform, researchers can browse the genome for the most commonly mutated loci in a given cancer. Users don't need extensive knowledge of bioinformatics to start. “We want to make this data accessible to as many types of scientific researchers and clinicians as possible,” says Alex Gout, PhD, the lead scientist of the St. Jude Cloud project. Providing data solely as unwieldy, text-only files could deter interested investigators, especially those lacking the training or resources to process them.

Sophisticated data-processing tools are also available. For instance, the Pediatric Cancer Variant Pathogenicity Information Exchange (known as PeCanPIE), released earlier this year, allows researchers to submit single-nucleotide variant and indel data and receive an annotated list of the variants that can be explored using an interactive visual interface.

The aims of the St. Jude Cloud align with those of related resources, such as the NCI Cloud Resources, part of the Cancer Research Data Commons. As with the St. Jude Cloud's datasets, the files containing the genomic and proteomic data NCI offers are so large that “it's not sensible to be downloading many of these large files and then storing them locally and trying to process them locally,” says Tanja Davidsen, PhD, a biomedical informatics specialist who helps manage the NCI Cloud Resources.

The NCI Cloud Resources can be viewed as complementary to the St. Jude Cloud, not as a competitor, says Jaime Guidry Auvil, PhD, director of the Office of Data Sharing at the NCI's Center for Biomedical Informatics and Information Technology. Researchers studying pediatric cancers may want to use the tools and datasets available through both institutions, particularly as each resource provides access to different data. “Moving forward,” says Dr. Guidry Auvil, “we'd love to create a more complete picture of pediatric cancers through connecting multiple pediatric data resources,” a sentiment echoed by Dr. Gout: “Hopefully through the recently announced Childhood Cancer Data Initiative we can begin to federate pediatric genomic data resources so we can achieve a greater understanding of pediatric cancer and how to develop more effective treatments in the future.” –Nicole Haloupek