Insights distilled from integrating multiple big-data or “omic” datasets have revealed functional hierarchies of molecular networks driving tumorigenesis and modifiers of treatment response. Identifying these novel key regulatory and dysregulated elements is now informing personalized medicine. Crucially, although there are many advantages to this approach, there are several key considerations to address. Here, we examine how this big data–led approach is impacting many diverse areas of cancer research, through review of the key presentations given at the Irish Association for Cancer Research Meeting and importantly how the results may be applied to positively affect patient outcomes. Cancer Res; 76(21); 6167–70. ©2016 AACR.

Recent advances in understanding tumor development and treatment response have evolved to use a combined analysis integrating multiple parallel “omic” datasets (genomics, proteomics, transcriptomics, microbiomics) generated from each tumor sample, producing functional hierarchies of molecular networks (1–4). The analysis of these functional hierarchies is enhanced by combining many of these networks into larger cohort-based (meta-) networks (targeting diseases, such as ovarian, colorectal, or breast cancer), creating unique insights into the mutational landscapes of cancers, the functional consequences in terms of gene expression, and dysregulation of the epigenome, such as in pediatric neuro-oncology (5). Generating and mapping each patient's unique omic profiles onto these meta-networks will facilitate truly personalized therapeutics. This synergistic approach has also opened exciting new treatment options and facilitated the development of advanced new molecular models of disease.

Producing these molecular networks requires the integration and interpretation of multiple large-scale datasets generated from the individual omic systems of each sample (such as the genome, proteome, and transcriptome). The term “big data” is used here to cover three key aspects: (i) the size and volume of information collected that must be analyzed and shared; (ii) the numerous types of complex information generated and processed; and (iii) the speed at which these data are generated (6). Specifically, the immense size of these combined datasets requires new and evolving analysis techniques, which reveal highly complex links between parallel processes. Once a network for each sample is created, multiple networks are combined to reveal common key processes and molecular elements (7). Importantly, this approach requires a significant investment in computational infrastructure and bioinformatics expertise, both of which are needed to effectively manage, exploit, and analyze these extraordinarily large datasets. Here, we examine the application of this big data approach through the evaluation of several key reports presented at the 2016 Irish Association for Cancer Research Meeting.
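The step of combining per-sample networks into a cohort-level network can be illustrated with a minimal sketch. It assumes each sample's network is simply a list of undirected gene-gene edges and keeps the edges seen in at least a chosen fraction of samples. The function name, the threshold, and the gene pairs are hypothetical and are not taken from any of the studies discussed here.

```python
from collections import Counter

def consensus_edges(sample_networks, min_fraction=0.5):
    """Return edges present in at least min_fraction of per-sample networks."""
    counts = Counter()
    for edges in sample_networks:
        # Treat edges as undirected and count each edge once per sample.
        counts.update({frozenset(edge) for edge in edges})
    threshold = min_fraction * len(sample_networks)
    return {tuple(sorted(edge)) for edge, n in counts.items() if n >= threshold}

# Illustrative (made-up) per-sample networks.
networks = [
    [("TP53", "MDM2"), ("KRAS", "BRAF")],
    [("MDM2", "TP53"), ("EGFR", "KRAS")],
    [("TP53", "MDM2"), ("KRAS", "BRAF")],
]
meta = consensus_edges(networks, min_fraction=0.6)
```

In this toy cohort, only the edges shared by at least two of the three samples survive. Real meta-networks would also weight edges and account for differences in data quality between samples.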

A key element in improving our understanding of human cancer is an accurate characterization of the tumor microenvironment (TME). Current approaches to cancer therapeutics integrate the induction of tumor cytotoxicity with modulation of the TME. A key obstacle to this approach is stratification of the TME to inform these treatment strategies and reveal potential novel treatment options (8). Fully exploiting this approach requires an in-depth understanding of the influence of the genetic mutations driving the tumor and of how the tumor location impacts the TME. Animal models are essential for research into the TME, but the further development of complex 3D human cell models will complement the animal studies. Novel biophysical and biomechanical approaches are required to produce these advanced, complex human 3D models, allowing them to support the in vitro 3D human TME in which malignant, hemopoietic, and mesenchymal cells will communicate, evolve, and grow.

Frances Balkwill (Barts Cancer Institute, Queen Mary University of London, London, United Kingdom) leads the CANBUILD project, focusing on a group of high-grade serous ovarian cancers that metastasize to the omentum, which are frequently found at disease presentation. The ultimate aim of CANBUILD is to construct this cancerous tissue in vitro using autologous cells. This European Research Council and Cancer Research UK–funded multidisciplinary project is responding to an urgent need for models that facilitate examination of the interaction between human immune cells and malignant cells from the same individual in an appropriate 3D biomechanical microenvironment.

A key element of their approach is “deconstruction” of the TME. This involves genomic, transcriptomic, and proteomic profiling of ovarian tumors. Using big-data techniques to integrate these profiles facilitates the production of a template for “reconstruction” (defining cell types, intra- and extracellular signaling pathways, genetic influences). This integrated dataset will be used to produce a model that can be refined on the basis of additional observations. Eventually, it is hoped the model will provide predictions that can be tested in vivo and ultimately influence clinical decisions. Using the data generated from the deconstruction stage, the group is reconstructing the tumor in silico. This facilitates multivariate analysis of relationships between the molecular features, genes and proteins, higher order structures, tissue biomechanics, tissue architecture, and cellularity (unpublished data). The reconstruction phase is currently testing three bioengineering approaches (functionalized PEG hydrogels, peptide amphiphiles, and a novel artificial omentum) to reconstruct a complex 3D TME in vitro (unpublished data).

It is widely appreciated that, in addition to cancer cells, solid tumors contain infiltrating host cells and an altered extracellular matrix (9). These infiltrating cells (including fibroblasts, endothelial cells, and immune cells) have been demonstrated to fundamentally alter tumor biology, modifying key clinical parameters, such as disease progression, response to therapy, and metastasis (10). Importantly, the mechanisms by which tumor cells recruit and co-opt these host cells and, conversely, how these interactions modulate tumor cell function are currently not well understood.

Multiple approaches have been used to address how tumor cells interact with local stromal cells, including imaging, small-molecule screening, transcriptional profiling, and proteomics analysis, using models ranging from complex in vivo to reductionist model systems (11–13). A major technical challenge in dissecting cellular signaling between tumor and host cells remains: extracting proteins from solid tumors or multicellular in vitro models typically results in loss of cell-specific information about the signaling molecules of interest.

To understand cell-specific signaling in a multicellular context, Jørgensen and colleagues (Cancer Research UK Manchester Institute, The University of Manchester, Manchester, United Kingdom) have combined stable isotope labeling of individual cell populations with proteomics analysis (SILAC; refs. 13–16). Combining the SILAC approach with a global phospho-proteomics analysis and informatics analysis of regulated phospho-motifs allows prediction of pathways that are regulated, importantly identifying the regulation in a context-dependent manner (15, 16).

Through these data-integrative approaches, they are combining multiple datasets to describe how signals are processed in a cell-specific manner. The long-term outcome will be to understand how specific signals are processed to promote tumor progression and how blocking these signals can enhance therapeutic responses.

The power and insights gained by generating and combining omic data were demonstrated by John Greally (Albert Einstein College of Medicine, New York, NY), using two examples of virally induced neoplasia: hepatocellular carcinoma in patients infected with hepatitis C virus, and cervical epithelial neoplasia in women infected with human papillomavirus. These tumors distinctively allow preneoplastic or early neoplastic stages of carcinoma development to be studied: the cirrhotic and inflamed liver in hepatocellular carcinoma and the cervical intraepithelial neoplasia stages preceding cervical carcinoma.

The hypothesis being tested was that epigenetic events create field defects predisposing to later mutational events that drive malignant transformation, using virally induced neoplasia with distinct neoplastic stages of development. Using genome-wide DNA methylation patterns, a common theme in both tumor types was demonstrated: an early acquisition of DNA methylation that is sustained in later stages of progression, with a late event of global loss of DNA methylation coincident with carcinomatous transformation (unpublished data).

Significantly, both cancer types were characterized by a subset of loci acquiring DNA methylation at known targets of polycomb repression. An immunohistochemical study revealed that the EZH2 component of polycomb is expressed in infected cervical epithelium, with increased expression correlating with neoplasia progression (unpublished data). This work provides insights into the potential mechanisms of each of these cancer types, a foundation for the development of predictive biomarkers, and the potential targeting of polycomb for preneoplastic chemoprevention.

These studies highlighted the need to manage and analyze data from large numbers of individuals. Significantly, they also highlighted the need to collect technical metadata (data describing how the experimental genomic data were generated), which are important for correct data analysis. Technical influences significantly affected the observed DNA methylation patterns, and these had to be removed before high-confidence findings could be identified.
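To make the role of technical metadata concrete, the following sketch removes a simple additive batch offset from methylation values by centering each technical batch on the overall mean. This is a deliberately simplified stand-in and is not the method used in the studies described; the function name, batch labels, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def center_by_batch(values, batches):
    """Shift each batch so that its mean equals the overall mean."""
    by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        by_batch[batch].append(value)
    batch_means = {b: mean(vs) for b, vs in by_batch.items()}
    overall = mean(values)
    return [v - batch_means[b] + overall for v, b in zip(values, batches)]

# Made-up methylation fractions at one locus, measured in two sequencing runs.
methylation = [0.20, 0.30, 0.60, 0.70]
batch = ["run1", "run1", "run2", "run2"]
adjusted = center_by_batch(methylation, batch)
```

Such a correction is only possible when the batch label was recorded, which is the point made above. If batch and biological group are confounded (for example, all tumors sequenced in one run), centering would also remove the biological signal.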

Professor John Greally described incorporating data from The Cancer Genome Atlas (TCGA), demonstrating the value of publicly available high-quality reference datasets. Using these integrated data, he described how epigenetic and transcriptional data can be combined to produce greater insights into the disease process, an integration approach that involves significant analytic and computational challenges.

Although there is an established symbiotic interaction between a host and their microbiota, it is only recently that clear evidence has emerged demonstrating the presence and role of microbiota in carcinogenesis (17–19). The use of genomic sequencing has helped to demonstrate additional nonhuman sequences present in many cancers, which were often filtered out prior to analysis. This has helped highlight the contribution of the microbiome to the transformed phenotype. However, the recognition of bacteria as a key element in many human cancers highlighted the need to specifically analyze these nonhuman sequences. Using high-throughput sequencing, followed by computational subtraction of human sequences, revealed microbiota in cancer samples. Dr. Susan Bullman (The Meyerson group, Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA) highlighted this approach. The development of computational approaches, such as PathSeq (20), allowed the isolation of nonhuman DNA sequences in deep-sequenced human disease tissue samples. Subsequently, cancer-associated bacteria (such as Helicobacter pylori) have been described in several cancer types (18, 21, 22). Recognizing the presence of bacteria in tumors has shed light on our understanding of cancer progression and importantly the mechanisms affecting how tumors respond to genotoxic therapeutics.
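The idea of computational subtraction can be sketched as follows. This toy version discards reads that share most of their k-mers with a human reference string and keeps the remainder as candidate nonhuman sequences. It is an assumption-laden illustration only and is not how PathSeq is implemented; the function names, the k-mer length, the threshold, and the sequences are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def subtract_human_reads(reads, human_reference, k=8, max_shared=0.5):
    """Keep reads whose fraction of k-mers found in the reference is <= max_shared."""
    ref_kmers = kmers(human_reference, k)
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k: cannot be classified, so drop it
        shared = len(read_kmers & ref_kmers) / len(read_kmers)
        if shared <= max_shared:
            kept.append(read)
    return kept

# Toy stand-in for a human reference and two sequencing reads.
human_reference = "ACGTACGTTAGCCGATAGGCTTAACCGGTTAGC"
reads = [
    "TAGCCGATAGGCTTAA",      # exact substring of the reference: subtracted
    "GGGGGCCCCCAAAAATTTTT",  # shares no 8-mers with the reference: kept
]
nonhuman = subtract_human_reads(reads, human_reference)
```

The retained reads would then be compared against microbial reference genomes or assembled de novo. Production pipelines rely on alignment against full reference genomes and on quality filtering, rather than exact k-mer matching.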

The TCGA has provided an unprecedented opportunity for sequencing-based pathogen discovery in cancer through the generation of large-scale sequencing data (up to 11,000 samples for approximately 33 human cancer types). Using an inventive computational approach, combined with the TCGA data, the Meyerson group profiled microbial signatures across more than 20 tumor types. Analyzing RNA sequencing (RNA-Seq) and/or whole-genome sequencing (WGS) data from more than 4,000 human tumor samples from TCGA cohorts using PathSeq allowed the identification of resident bacteria, viruses, fungi, bacteriophage, and archaea within each tumor or normal tissue specimen (see refs. 23–25 and unpublished data). This led to the identification of microbial species enriched in tumor tissue, compared with matched normal tissue using LDA effect size analysis (23). In addition, correlations between the abundance of specific microbes and host gene expression (RNA-Seq data), protein profiles (RPPA data), mutation signatures (whole-exome sequencing and WGS), molecular subtypes, and other clinicopathologic details can be used to further characterize the effects of bacteria present in tumors.

This approach identifies bacterial species that are enriched or depleted within human tumors. The application of this approach revealed an overabundance of Fusobacterium nucleatum in association with colorectal cancer (25). Further demonstrating the power of this approach, DNA sequencing of cord colitis samples revealed previously unknown nonhuman sequences. The assembly of these sequences produced a draft bacterial genome with a high degree of homology to the Bradyrhizobium genus of bacteria, with the new strain provisionally named Bradyrhizobium enterica (B. enterica). B. enterica nucleotide sequences were subsequently found in biopsy specimens from cord colitis patients, but not in healthy control samples (24). These approaches emphasized the importance of determining differences between cancer-associated and non-cancer–associated bacterial strains. Identifying different genomic features between strains and determining whether these changes are shared specifically between tumor isolates could potentially reveal “high-risk” strains.

Clearly, evaluating the tumor microbiome as a key component of the TME will provide novel insights into how pathogens contribute to tumorigenesis and affect treatment, and may ultimately lead to novel therapeutic targets.

Many simpler biological systems (individual datasets) have classically been studied using statistical methods, which fail to account for time-dependent changes in functionality or network topology (arrangement of the elements). One systems biology approach to integrating changes in functionality and network topology applies ordinary differential equations (ODEs), which describe how quantities change over (continuous) time, to data mining. This has allowed systems biology mathematical models to begin revealing new disease-related insights.
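As an illustration of what an ODE-based description looks like, the sketch below integrates a single reversible binding reaction, A + B ⇌ AB, governed by d[AB]/dt = k_on[A][B] − k_off[AB], using the forward Euler method. It is a generic textbook example chosen for brevity, not a model from any study discussed in this report; the species, rate constants, and step size are hypothetical.

```python
def simulate_binding(a0, b0, k_on, k_off, dt=0.001, t_end=10.0):
    """Integrate d[AB]/dt = k_on*[A]*[B] - k_off*[AB] with forward Euler."""
    a, b, ab = a0, b0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Net rate of complex formation at the current concentrations.
        rate = k_on * a * b - k_off * ab
        a -= rate * dt
        b -= rate * dt
        ab += rate * dt
    return a, b, ab

# Arbitrary units: equal starting amounts of A and B, no complex at t = 0.
a, b, ab = simulate_binding(a0=1.0, b0=1.0, k_on=1.0, k_off=0.5)
```

With these parameters, the complex concentration should approach its analytical equilibrium of 0.5, where formation and dissociation balance. Mechanistic pathway models couple many such equations, one per molecular species, and are usually solved with adaptive numerical integrators rather than a fixed-step scheme.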

Recent work by the Prehn group (Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin, Ireland) has implicated deregulation of the apoptosis pathway in chemotherapy resistance. Their ODE-based modeling of apoptosis signaling has provided prognostic insights and predictive tools for colorectal cancer.

ODE-based mechanistic mathematical models were used to predict upstream apoptosis signaling controlled by the BCL-2 family, ultimately regulating mitochondrial permeabilization (26). The model developed by the Prehn group proved to be a potent systems-based prognostic tool for stage III colorectal cancer and has significant potential as a predictive tool for 5-FU–based chemotherapy in patients with stage II colorectal cancer. The strength of this approach was demonstrated as the model predicted patient mortality independent of pathologic TNM staging and KRAS mutational status. However, predictions are strongly dependent on the previously described consensus molecular subtypes (1 and 3; ref. 27). The current work is exploring the potential of this approach by applying the model as a predictive prognostic tool to stratify the response of colorectal cancer patients treated with BCL-2 antagonists.

This demonstrates the power of using systems-based modeling to assess complex protein–protein network interactions, producing clinically relevant prognostic tools.

The 2016 Irish Association for Cancer Research Meeting brought together some of the best international and Irish cancer researchers, highlighting the importance of big data–led approaches in facilitating a more complete understanding of each tumor type and helping us understand how tumors are supported within the host. A key message delivered was that comprehensive omic profiling of human tumors and the surrounding TME can reveal mechanisms of tumorigenesis (viral and bacterial), generate prognostic biomarkers, and identify potential therapeutic targets for personalized treatment.

The studies presented at this meeting highlight the need to capture the datasets, generated by omic studies, that describe each patient or sample. Big-data management is facilitated by high-powered cloud computing, which allows these large datasets to be stored (including in online repositories), analyzed with increased computing power, and accessed internally and externally, all relatively cheaply. Clearly, exploiting these combined datasets for cancer research requires accurate data management and new, informative analysis processes. Meeting these requirements will deliver on the promise of breakthroughs (fundamental and clinical) offered by in silico–based, system-wide approaches to cancer analysis. This will allow more precise, individualized targeted therapeutic regimens to be developed and will ultimately improve patient outcomes.

Speaker | Affiliation
Prof. Frances Balkwill | Barts Cancer Institute, Queen Mary University of London, United Kingdom
Dr. Susan Bullman | Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA
Prof. Gordon J. Freeman | Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA
Dr. Susana Godinho | Barts Cancer Institute, Queen Mary University of London, United Kingdom
Prof. John Greally | Albert Einstein College of Medicine, New York, NY
Dr. Claus Jorgensen | Cancer Research UK Manchester Institute, The University of Manchester, United Kingdom
Prof. Diether Lambrechts | VIB Vesalius Research Center, Leuven, Belgium
Prof. Kingston Mills | Trinity College Dublin, Dublin, Ireland
Prof. Ciaran Morrison | National University of Ireland Galway, Galway, Ireland
Prof. Jochen Prehn | Royal College of Surgeons, Dublin, Ireland
Prof. Leonard Seymour | Oxford University, Oxford, United Kingdom
Dr. Mark Tangney | University College Cork, Cork, Ireland

No potential conflicts of interest were disclosed.

We would like to thank all speakers at the IACR meeting and apologize to those speakers whose work could not be included. We would like to acknowledge the contribution of the speakers discussed here in writing this report. We would like to acknowledge that this meeting was organized and run by all members of the IACR council and thank all council members for their efforts in planning and hosting a topical and well-received international conference. We would also like to take this opportunity to thank all the conference sponsors, whose support allows us to host a meeting with high-caliber international speakers.

1. Aviner R, Shenoy A, Elroy-Stein O, Geiger T. Uncovering hidden layers of cell cycle regulation through integrative multi-omic analysis. PLoS Genet 2015;11:e1005554.
2. Franzosa EA, Hsu T, Sirota-Madi A, Shafquat A, Abu-Ali G, Morgan XC, et al. Sequencing and beyond: integrating molecular ‘omics’ for microbial community profiling. Nat Rev Microbiol 2015;13:360–72.
3. Kulbe H, Iorio F, Chakravarty P, Milagre CS, Moore R, Thompson RG, et al. Integrated transcriptomic and proteomic analysis identifies protein kinase CK2 as a key signaling node in an inflammatory cytokine network in ovarian cancer cells. Oncotarget 2016;7:15648–61.
4. Yugi K, Kubota H, Hatano A, Kuroda S. Trans-Omics: how to reconstruct biochemical networks across multiple ‘omic’ layers. Trends Biotechnol 2016;34:276–90.
5. Northcott PA, Pfister SM, Jones DT. Next-generation (epi)genetic drivers of childhood brain tumours and the outlook for targeted therapies. Lancet Oncol 2015;16:e293–302.
6. Mattmann C. Computing: a vision for data science. Nature 2013;493:473–5.
7. Golden A, Djorgovski S, Greally JM. Astrogenomics: big data, old problems, old solutions? Genome Biol 2013;14:129.
8. Crusz SM, Balkwill FR. Inflammation and cancer: advances and new agents. Nat Rev Clin Oncol 2015;12:584–96.
9. Egeblad M, Nakasone ES, Werb Z. Tumors as organs: complex tissues that interface with the entire organism. Dev Cell 2010;18:884–901.
10. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell 2011;144:646–74.
11. Avgustinova A, Iravani M, Robertson D, Fearns A, Gao Q, Klingbeil P, et al. Tumour cell-derived Wnt7a recruits and activates fibroblasts to promote tumour aggressiveness. Nat Commun 2016;7:10305.
12. Hirata E, Girotti MR, Viros A, Hooper S, Spencer-Dene B, Matsuda M, et al. Intravital imaging reveals how BRAF inhibition generates drug-tolerant microenvironments with high integrin β1/FAK signaling. Cancer Cell 2015;27:574–88.
13. Locard-Paulet M, Lim L, Veluscek G, McMahon K, Sinclair J, van Weverwijk A, et al. Phosphoproteomic analysis of interacting tumor and endothelial cells identifies regulatory mechanisms of transendothelial migration. Sci Signal 2016;9:ra15.
14. Anton KA, Sinclair J, Ohoka A, Kajita M, Ishikawa S, Benz PM, et al. PKA-regulated VASP phosphorylation promotes extrusion of transformed cells from the epithelium. J Cell Sci 2014;127:3425–33.
15. Jørgensen C, Sherman A, Chen GI, Pasculescu A, Poliakov A, Hsiung M, et al. Cell-specific information processing in segregating populations of Eph receptor ephrin-expressing cells. Science 2009;326:1502–9.
16. Tape CJ, Norrie IC, Worboys JD, Lim L, Lauffenburger DA, Jørgensen C. Cell-specific labeling enzymes for analysis of cell-cell communication in continuous co-culture. Mol Cell Proteomics 2014;13:1866–76.
17. Cho I, Blaser MJ. The human microbiome: at the interface of health and disease. Nat Rev Genet 2012;13:260–70.
18. Schwabe RF, Jobin C. The microbiome and cancer. Nat Rev Cancer 2013;13:800–12.
19. Thomas RM, Jobin C. The microbiome and cancer: is the ‘oncobiome’ mirage real? Trends Cancer 2015;1:24–35.
20. Kostic AD, Ojesina AI, Pedamallu CS, Jung J, Verhaak RG, Getz G, et al. PathSeq: software to identify or discover microbes by deep sequencing of human tissue. Nat Biotechnol 2011;29:393–96.
21. de Martel C, Ferlay J, Franceschi S, Vignat J, Bray F, Forman D, et al. Global burden of cancers attributable to infections in 2008: a review and synthetic analysis. Lancet Oncol 2012;13:607–15.
22. Faïs T, Delmas J, Cougnoux A, Dalmasso G, Bonnet R. Targeting colorectal cancer-associated bacteria: a new area of research for personalized treatments. Gut Microbes 2016 Mar 23. [Epub ahead of print].
23. Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, et al. Metagenomic biomarker discovery and explanation. Genome Biol 2011;12:R60.
24. Bhatt AS, Freeman SS, Herrera AF, Pedamallu CS, Gevers D, Duke F, et al. Sequence-based discovery of Bradyrhizobium enterica in cord colitis syndrome. N Engl J Med 2013;369:517–28.
25. Kostic AD, Gevers D, Pedamallu CS, Michaud M, Duke F, Earl AM, et al. Genomic analysis identifies association of Fusobacterium with colorectal carcinoma. Genome Res 2012;22:292–98.
26. Lindner AU, Concannon CG, Boukes GJ, Cannon MD, Llambi F, Ryan D, et al. Systems analysis of BCL2 protein family interactions establishes a model to predict responses to chemotherapy. Cancer Res 2013;73:519–28.
27. Guinney J, Dienstmann R, Wang X, de Reynies A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med 2015;21:1350–6.
28. Tape CJ, Worboys JD, Sinclair J, Gourlay R, Vogt J, McMahon KM, et al. Reproducible automated phosphopeptide enrichment using magnetic TiO2 and Ti-IMAC. Anal Chem 2014;86:10296–302.