Substantial effort in recent years has been devoted to constructing and analyzing large-scale gene and protein networks on the basis of “omic” data and literature mining. These interaction graphs provide valuable insight into the topologies of complex biological networks but are rarely context specific and cannot be used to predict the responses of cell signaling proteins to specific ligands or drugs. Conversely, traditional approaches to analyzing cell signaling are narrow in scope and cannot easily make use of network-level data. Here, we combine network analysis and functional experimentation by using a hybrid approach in which graphs are converted into simple mathematical models that can be trained against biochemical data. Specifically, we created Boolean logic models of immediate-early signaling in liver cells by training a literature-based prior knowledge network against biochemical data obtained from primary human hepatocytes and 4 hepatocellular carcinoma cell lines exposed to combinations of cytokines and small-molecule kinase inhibitors. Distinct families of models were recovered for each cell type, and these families clustered topologically into normal and diseased sets. Cancer Res; 71(16); 5400–11. ©2011 AACR.

Major Findings

Clustering arises from systematic differences in signaling logic in three regions of the network. We also infer the existence of a new interaction involving Jak-Stat and NFkB signaling and show that it arises from the polypharmacology of an IkB kinase inhibitor rather than previously unidentified protein–protein associations. These results constitute a proof-of-principle that receptor-mediated signal transduction can be reverse engineered by using biochemical data so that the immediate effects of drugs on normal and diseased cells can be studied in a systematic manner.

We start with a prior knowledge network (PKN) comprising a “signed and directed” node-edge graph that depicts interactions among proteins as arrows so that substrate–product and activation–inhibition relationships are captured: Raf→MEK→ERK for example. The 78-node, 112-edge PKN in this article came from the Ingenuity database with manual additions. A PKN cannot be compared directly with cell response data because a model is needed to specify input–output relationships. Here, we use a simple Boolean formalism, in which the state of a node is either 0 or 1 (“off” or “on”), and nodes interact via AND/OR/NOT logical operators. In this formalism, if epidermal growth factor receptor OR IR is active, then MAP/ERK kinase (MEK) is active, etc. Conversion of a PKN into a Boolean model proceeds as follows. Network preprocessing specifies proteins to be measured or perturbed experimentally (so-called designated nodes) and then compresses the PKN on the basis of 2 criteria (ref. 1; a) nodes that are not directly or indirectly connected to designated nodes are eliminated (because we have no data on them; b) cascades in which a undesignated nodes impinge on designated nodes are simplified; for example, if Raf and ERK are designated, but MEK is not, then Raf→MEK→ERK is replaced by Raf→ERK. Logical expansion computes all possible combinations of logic gates compatible with the compressed network. Each interaction in the PKN (hyperedges in graph theory) can give rise to multiple logical connections, so that if 2 edges link into ERK (e.g., Raf→ERK and NFκB→ERK), we would generate 3 logic interactions: (i) Raf→ERK, (ii), NFκB→ERK, and (iii) Raf AND NFκB→ERK. An OR gate (Raf OR NFκB→ERK) simply corresponds to (i) plus (ii). In our compressed PKN, 32 nodes and 128 logical interactions (hyperedges) give rise to 2128 = ∼1038 possible models, each of which is evaluated against data. Training a family of possible models against data involves propagating the input signals along the logical network until all nodes reach interconsistent values (a logical steady state); models are then compared with experimental measurements. Raw data are processed so that arbitrary intensity measures from xMAP sandwich immunoassays are converted into values between 0 and 1 on the basis of various standards, a procedure that we have previously described in detail (1). For quantitative analysis of model/data match, the deviation [mean squared error (MSE)] between data and a specific model is computed as |${\rm MSE} = \frac{1}{{n_E }}\sum\limits_{k = 1}^s {\sum\limits_{l = 1}^m {\sum\limits_{t = 1}^n {\left( {B_{k,l,t}^M - B_{k,l,t}^E } \right)^2 } } }$|⁠, in which |$B_{k,l,t}^M \in \left\{ {0,\,1} \right\}$| is predicted by the logical steady state of the model and |$B_{k,l,t}^E \in \left[ {0,\,1} \right)$| is the discretized data for assay l recorded at time t under the kth experimental condition. Values incompatible with a logical steady state are penalized as though they represent a mismatch between simulation and experimental data. We seek the simplest models consistent with data, using a bipartite objective function: Θ = MSE + α ΘS, in which ΘS is model size and α is an adjustable parameter. We have shown that models identified by using this objective function are similar in their numbers of edges and goodness of fit across a wide range of values for α (1); in this article, we set α = 0.0001.

The training procedure consists of searching across 2128 models by the objective function by using a standard genetic algorithm (2). Model training was iterated 50 to 100 times for each set of data. Because of the stochastic nature of the genetic algorithm and the nonidentifiability of the models given data, different solutions were recovered each time. In common with most work on network inference, we addressed nonidentifiability by analyzing families of models instead of single solutions; in this work, models within 1% MSE of the best-fit model constituted the “consensus.” For subsequent analysis, the distance between 2 sets of models j and k was computed as |$d_{j,k}^2 = \sum\limits_{i = 1}^n {\left( {f_i^j - f_i^k } \right)^2 }$|⁠, in which |$f_i^k$| is the frequency of the ith hyperedge in the kth model set. Distances were normalized with respect to the distance between hepatocyte and AvgHCC models (Supplementary Table S1).

The availability of high-throughput interaction data has led to the creation of methods for summarizing and exploring networks by using node–edge graphs. In these graphs, genes or proteins are represented by nodes (vertexes) and interactions by edges (3, 4). The underlying interaction data are diverse and include manual or automated text mining of the literature (5, 6), genetic interactions obtained from gene deletion sets, and physical interactions identified by large-scale mass spectrometry or 2-hybrid analysis (4, 7). Interactions in node–edge graphs can be undirected (denoting an unspecified interaction), directed but unsigned (denoting substrate–product relationships), or directed and signed (denoting both substrate–product and inhibition–activation relationships); the latter are particularly useful because they capture biochemical causality. For protein data, graphs comprising undirected edges are typically called protein interaction networks (PIN), whereas those with signed directed edges are known as protein signaling networks (PSN). Most work on PINs and PSNs to date has focused on adding as much data as possible, often from more than one organism or type of experiment, so as to construct large networks with the greatest possible scope and the greatest number of interactions per node (increasing the “degree” of the network); the culmination of this effort is a proposed “Human Interactome” covering all known gene products (8).

In cancer biology, comparative analysis is the natural focus of “conventional” low-throughput studies of signal transduction, with particular attention paid to differences in cellular responses to ligands or drugs in different cell types. In most cases, these differences reflect changes in the abundance or activity of signaling proteins (or of their substrates), features that could in principle be depicted by the strength of an edge in a network graph. However, existing PSNs and PINs do not encode the activities of proteins in cells that have been exposed to specific activators or inhibitors. A dearth of data on context-specific interactions makes it difficult to compare normal and diseased cells, or diseased cells from different tumors. Cell- and state-specific information has been added to network graphs using gene expression data (3, 7, 9), but few attempts have been made to reconstruct comparative networks using biochemical data.

In this article, we attempt to combine concepts from global network discovery and traditional biochemistry by constructing comparative network models of signal transduction in normal and transformed liver cells. Starting with a prototypical network derived from the literature [which we will refer to as a prior knowledge network (PKN)], we first constructed a set of all Boolean models compatible with the PKN, used the model “superstructure” to guide the collection of biochemical data on multiple nodes in the network across multiple cell types, and then trained the superstructure against data to uncover underlying differences in signaling logic among cell types. The net result is a computational representation of a signaling network that focuses on activity rather than literature association or physical interaction and that is explicitly comparative.

A first essential step in adding activity data to networks is to convert PKNs into models in which it is possible to compute input–output (I/O) characteristics (1). In this article, we use a 2-state (Boolean) logical formalism in which each node can have only two states, 0 or 1, but having a 1 at the output can depend on having a 1 at one of several inputs (an OR gate), all inputs (an AND gate), or 0 and 1 inputs in any combination. Boolean models have the advantage that they have no continuous free parameters and their topologies can be trained efficiently using data (1), a task that is harder with large differential equation models (10). However, we recognize that real biological systems exhibit dose response behavior that is only poorly approximated by Boolean logic. Thus, a major question at the outset of this work was whether the strengths of Boolean modeling with respect to computational simplicity would outweigh its weaknesses. It seemed possible that the crudeness of the Boolean on/off approximation would overwhelm any differences we might measure experimentally from one cell type to the next. Conversely, success in creating comparative models would constitute a proof-of-principle for the approach.

We therefore applied Boolean modeling to distinguishing patterns of immediate-early signaling in normal and transformed cells, represented here by primary human hepatocytes and HepG2, Hep3B, Focus, and Huh7 liver cancer cell lines. Liver cancer (which is dominated by hepatocellular carcinoma [HCC]) is the third most common cause of cancer death in humans (11) and is known to involve alterations in the EGF-Ras-MAPK, AKT-mTOR, Jak-Stat, and NF-κB cascades (12). Thus, we aimed to collect multivariate data on the activities of these pathways in normal and transformed hepatocytes. We show that it is possible to assemble predictive network models that are specific to each cell type, cluster models based on topology, and uncover consistent biochemical differences between transformed and normal cells. By identifying an interaction missing from the starting PKN, but supported by data, we also uncover a poorly documented off-target effect of a drug being developed for asthma and inflammation (13, 14). Our findings show that discrete logical modeling can capture cell-type specific biochemical relationships, raising the possibility of constructing large comparative models of signal transduction in normal and diseased cells.

Data generation

HepG2 and Hep3B, HuH7, and Focus cells were plated in 96-well plates coated with collagen type I (Becton Dickinson) with 100 μL phenol-free Williams' Medium E (WEM; Sigma-Aldrich) with supplements (15). Freshly preplated primary human hepatocytes were purchased from CellzDirect and cultured overnight on collagen, starved for 6 hours in 95 μL of WEM, exposed to kinase inhibitors for 40 minutes, and then to ligands for 25 minutes. All cells were lysed in 90 μL of manufacturer's xMAP buffer (Bio-Rad) and intracellular signals measured by using a Luminex Instrument (Luminex) and 16-plex phosphoprotein bead sets (Bio-Rad; see Supplementary Material). Different dilutions of cell extract were required for all 16 signals to be in the linear range.

HepG2 and Hep3B cell lines were obtained directly from American Type Culture Collection (ATCC); Huh7 and Focus cells (16), which are not available from ATCC, were obtained from Prof. Jack Wands (Brown University, Providence, RI). Cells were expanded once to create master stocks from which working cultures were periodically established; no lines were serially passaged longer than 3 weeks. ATCC cell lines are validated by the provider (17); no testing was carried out on the other lines. Freshly plated human hepatocytes were tested for contaminants as described in ref. 15.

Reagents

To minimize experimental variability, samples were processed in parallel and common stocks of cytokines, inhibitors, and assay reagents used throughout (see Supplementary Materials). Recombinant Jak2 and IKK-2 obtained from Cell Signaling Technology were preincubated with kinase inhibitors for 5 minutes in manufacturer's buffer with 20 μmol/L ATP and FLT3 as Jak2 substrate or Iκb as a IKK-2 substrate. Phosphorylation was assayed by using a time-resolved fluorescence plate reader (PerkinElmer Victor 3) and apparent IC50 calculated from the activity A, in which A = (I + [drug]/IC50H)−1 and H is the apparent Hill coefficient. Ki for TPCA-1 was calculated by using the Cheng–Prusoff equation, Ki = IC50/(1 + [ATP]/KM), in which KM (for ATP) was 0.43 μmol/L for Jak2 and 0.91 μmol/L IKK-2.

Data handling

Data were processed and visualized using DataRail software (18, 19), with xMAP measurements normalized to a value between 0 and 1 (1).

Network construction and model calibration

The starting PKN was constructed with ProMoT (20) using the database of Ingenuity Systems (21) supplemented manually. Modeling was then done using our MATLAB toolbox CellNetOptimizer(CellNOpt; ref. 22).

The dynamics of immediate-early signaling pathways were probed by using a combinatorial experimental protocol (23): primary human hepatocytes and 4 HCC lines were treated with Interleukin (IL)-1α, IL-6, TGF-α, TNF-α, and insulin (15), in the presence and absence of small-molecule kinase inhibitors of IKK, MAP/ERK kinase (MEK), and phosphoinositide 3-kinase (P13K), and 16 intracellular signaling proteins were then assayed in cell extracts using multiplex sandwich immunoassays (xMAP assays; Luminex Inc. Fig. 1 and Supplementary Fig. S1; ref. 15). Our use of kinase inhibitors and phosphoprotein antibodies naturally focused the analysis on the druggable kinome (24), but similar analysis with other classes of drugs and signaling proteins is also possible. The experiments generated 5 sets of ligand-response data (for HepG2, Focus, Hep3B, and Huh7 cell lines; “hepatocytes” refers to primary human hepatocytes); a sixth dataset (AvgHCC) was synthesized by arithmetically averaging data from the 4 tumor lines and attempts to capture biochemistry common to all cell lines.

Because Boolean models lack continuous parameters (akin to the rate constants in an differential equation network), it is not necessarily true that training will yield a model having a substantially better fit to data than the PKN, but this was the case with our data and models: The untrained ensemble containing all possible interactions and logic exhibited a poor fit (39%–47% MSE error across all datasets, Fig. 3), whereas families of trained models exhibited much better fit (9%–13% MSE error, see Fig. 3 and Supplementary Fig. S2). We carried out cross-validation and statistical tests to show that trained models were predictive of real data and were nonrandom (Supplementary Figs. S3–5). Moreover, models trained against individual HCC cell lines were all better than the starting ensemble at predicting AvgHCC data (which was not used in training; 10%–14% MSE error, Fig. 3A). When the fit between models and data corresponding to individual ligands or biochemical assays was examined, levels of error were relatively low except in the case of p53 and IRS1s, which exhibited poor fits across all conditions (Fig. 3B and Supplementary Fig. S6). This almost certainly arises because the PKN represents p53 biology in an imprecise manner and annotation of IRS1 modification is incomplete. These are areas in which improved PKNs would clearly be useful. Nonetheless, we conclude that model training recovers substantially better network representations than the starting PKN.

Signaling network properties determined from data-trained logical models

In our procedure, training a PKN-based Boolean model against data improves the goodness of fit by removing unused edges. However, connectivity varied significantly with cell type: 85 of 128 possible gates were present in the superposition of all models in all cell types (involving ∼90% of the interactions not removed in the preprocessing), but only 7 gates were common to all models (Fig. 4 and Supplementary Fig. S7). We therefore concluded that the primary deficiency of the literature-derived PKN with respect to our data is not the presence of true false-positive interactions (because some support can be found for most edges in data) but rather an absence of cell-type specificity.

To compare the topologies of models for all 6 datasets, we computed pairwise distances by enumerating edges that differed between averaged best-fit models (see Materials and Methods and Supplementary Table S1 for details). HCC-derived models clustered together, with models built from AvgHCC data in the middle of the cluster, and well separated from models of primary hepatocytes (Fig. 5A). Models of Focus cells were farthest from primary cell models and HepG2 models were closest, in agreement with a classification of HCC lines proposed previously (25–27; Fig. 5B). Moreover, the pattern of clustering derived from network topology was generally similar to the pattern computed from transcriptional profiles. Although the goal of logical modeling is not to generate cluster diagrams (being focused instead on the biochemistry of signal transduction), the similarities between clusters generated using transcript profiling and logical models suggest that the biochemical processes covered in our networks are representative of broader differences across cell lines.

Pathway differences in signaling networks among primary and transformed cells

Next, we asked which interactions or logical gates were consistently present or absent when all possible models for one cell type were compared with all models of another cell type. This is a conservative approach that accounts for the inability of training to uniquely specify a model for each cell type on the basis of available data. We observed that 1 interaction was absent from all models of HCC cells although being present in all models of primary hepatocytes, whereas 6 interactions had the opposite property, being present only in models of HCC cells (Fig. 6). These differential interactions affected 3 regions of the signaling network. First, whereas the epidermal growth factor receptor (EGFR) ligand TGF-α caused ERK activation in all cell types, upregulation of Hsp27-S78 phosphorylation (presumably by PRAK kinase, which lies downstream of ERK) was observed only in primary cells (differential interactions 1–2). In 2 of 4 HCC cell lines (Focus and Huh7), Hsp27-S78 was phosphorylated to a significant degree, but it was p38- rather than ERK dependent (Supplementary Fig. S7). These findings are consistent with a reported association between low levels of Hsp27-S78 phosphorylation and tumor progression in HCC; our data show that the situation is more complex than previously thought (28), potentially involving a switch in Hsp27 kinases. A second significant difference between primary and HCC cells involved a change in the inferred logic of the IKK–NF-κB pathway: In primary hepatocytes Iκb-S32/S36 phosphorylation (a signal for Iκb degradation and consequent nuclear localization of NF-κB) required TNF-α and an activator of PI3K such as TGF-α (differential interactions 3–4). In contrast, in HCC lines only TNF-α was required, implying differential control over canonical NF-κB–mediated signaling.

The third significant difference involved phosphorylation of PI3K/AKT and GSK3-S9/S21 in insulin-treated HCC cells but not in primary hepatocytes (differential interactions 5–8). AKT is a potent prosurvival kinase, and its phosphorylation of GSK3 on S9/S21 is known to downregulate GSK3 activity and promote nuclear localization of β-catenin, NFAT, and other progrowth factors (29). Insulin receptor (IR) substrate 1 (IRS1) is overexpressed in HCC cell lines (30), and it is possible that this shifts IR signaling (or signaling by insulin-like growth factor receptors, which are also present in these cells) from a metabolic function in normal liver (31) to a prosurvival function in tumors that involves elevated PI3K/AKT and GSK3 phosphorylation. Increased AKT activity is also expected in tumors due to downregulation of proteins such as the p85 subunit of PI3K, a common feature of HCC (32).

Overall, we conclude that direct comparison of Boolean models was successful in identifying activity-dependent differences among normal and diseased cells that are hard to capture in PINs and PKNs assembled from physical interaction or steady-state proteomic and expression data. At the same time, we note that the Boolean models in this article capture signaling at a single time point only, and only describe ligand-driven changes in phosphorylation. The absence of an IR→PI3K link in the inferred map for hepatocyte signaling (despite the known function of IR in the liver; ref. 32) might arise because basal levels of AKT phosphorylation are high in these cells, making it difficult to detect ligand-dependent superactivation, or because we assayed cells at the wrong point in time (methods for incorporating time-series data into calibrated Boolean models are in development). However, Boolean modeling correctly inferred an EFGR→PI3K link in both transformed and primary cells and follow-up experiments suggest that there is indeed a greater propensity of tumor cells to respond to insulin by activating AKT.

Identification and testing of model-inferred novel pathway interactions

The model training described above focused on eliminating interactions present in the PKN but not supported by data. However, it is likely that PKNs also lack interactions that are supported by data. Indeed, we observed that the single largest error in all models with respect to data involved an observed inhibition of Stat3-Y705 phosphorylation by the IKK inhibitor TPCA-1 under conditions of IL-6 stimulation (Fig. 3B). TPCA-1 is reported to be a potent and selective inhibitor of human Iκbkinase-2 (IC50 = 18 nmol/L for IKK-2 and 400 nm for IKK-1) and was originally identified by GlaxoSmithKline in a drug discovery effort focused on rheumatoid arthritis and airway inflammation (13, 14). The inhibition of Stat3-Y705 phosphorylation by TPCA-1 can be explained in a Boolean model by adding an interaction between IKK and Stat3 (see Fig. 4); this reduced the MSE error of all model families by approximately 5%. Whereas an IKK→Stat3 edge might represent a physical interaction, an alternative explanation is that prior knowledge about TPCA-1 is incorrect and the drug actually inhibits Jak2, the known kinase for Stat3. To distinguish among these possibilities, we carried out in vitro activity assays of purified Jak2 and IKK-2 in the presence of TPCA-1 or BMS345541, an IKK-2 inhibitor having a distinct chemical structure (IC50 ∼300 nmol/L). BMS345541 does not compete with ATP (it binds to homologous allosteric sites on IKK-1 and IKK-2) and presumably has different off-target effects. We found TPCA-1 to be nearly as potent an inhibitor of Jak2 in vitro (Ki ∼9 nmol/L) as of IKK-2 (Ki∼1.6 nmol/L), its known target, but BMS345541 was IKK selective. Moreover, in IL-6–stimulated cells, BMS345541 reduced phosphorylation of the IKK substrate Iκbα on Ser32/Ser36 but had no detectable effect on the level of phosphorylated Stat3-Y705 (Fig. 7). We conclude that Jak2 is a target of TPCA-1 (consistent with a recent kinome profile; ref. 33), and that Boolean network inference therefore identified a new target for the drug rather than a new protein–protein interaction.

Despite the relative crudeness of 2-state logical approximations of biochemical reactions, this article shows that it is possible to use Boolean modeling in combination with high-throughput cell-response data to automate discovery of biochemical differences in signal transduction among tumor and normal cell types. Applying the approach to primary human hepatocytes and 4 HCC cell lines revealed consistent differences in the apparent logic and activities of growth factor receptor and intracellular kinase cascade in response to different ligands. Among the inferred differences between normal and transformed cells are several involving the strength or logic of signaling among IR, PI3K, AKT, and NF-κB, all molecules that have been implicated in the development of HCC. An unexpected pharmacologic insight was the identification of Jak-Stat signaling as a target for TPCA-1, an IκB kinase inhibitor developed to treat arthritis and airway inflammation. Detecting this polypharmacology required comparison of a computable network model against data across a landscape of treatment conditions, thereby allowing multivariate effects to be linked to specific causes. Intriguingly, TPCA-1 is significantly more potent than other IKK inhibitors in assays for airway inflammation. Both Jak2/Stat3 and IKK/NF-κB play a role in inflammation (13, 14) and TPCA-1 would therefore seem to be a “dirty” drug that is superior to a drug that binds specifically to the nominal target (34). More generally, the approach to modeling described in this article may constitute a general means to study polypharmacology that is complementary to methods for investigating drug mechanism on the basis of transcriptional data and protein interaction networks (35).

Our method focuses on eliminating interactions in the PKNs that do not fit data. Because the number of potential edges in an approximately 80-node network exceeds 1040, it is currently impossible to conduct a comprehensive search for new edges that improve the fit to data. However, in the current work simple inspection sufficed to identify a potential AND-gated edge connecting IKK→Stat3 that was absent from the PKN. Implementing a rigorous approach to finding new edges will require efficient means to search models locally or to make more intelligent use of prior knowledge. Alternatively, a variety of network–inference methods other than Boolean logic are likely to be effective for analyzing biochemical data, including differential equations (36–38), Bayesian networks (39), and fuzzy logic (40). Continued development of CellNOpt on our part, as well as the creation of alternative modeling approaches fostered by efforts such as DREAM (41), is likely to improve the efficiency and accuracy of network inference from biochemical data. Moreover, improvements in affinity capture assays, protein arrays (42), and mass spectrometry (43, 44) should make it possible to collect much more extensive training data than we report here.

Deep sequencing and other approaches to genome analysis support the idea that it is networks and pathways rather than individual genes that are the functional objects of oncogenic mutation and selection in human cancer (45). However, approaches for constructing large-scale protein interaction graphs (4–6) are rarely cell-type specific, and traditional “bottom-up” experiments that focus on cellular responses rather than network topologies are effective at uncovering such differences, but they cannot easily incorporate network information in a formal way. Comparative network inference is needed to close the gap, but current efforts focus on gene regulatory networks, in large part because available datasets involve expression signatures, gene sequences, and transcription factor binding sites (9, 46). Our results with Boolean logic and biochemical data constitute an encouraging proof-of-principle that the biochemistry of signaling networks, including the states and activities of proteins important in modern drug discovery, can be also be inferred and studied systematically. Because the product of our analysis is a computable model, it is amenable to continuous improvement and extension (with new data and interactions, for example) in much the same way as networks inferred from genome data. In contrast, it is difficult to account for new data by using conventional, informal descriptions.

However, comparison of contemporary approaches to studying gene regulatory network in cancer (e.g., Carro and colleagues; ref. 9) with our work serves to illustrate a fundamental difference between genome-scale data and “high throughput” biochemistry. Genomic datasets tend to contain many data points, and identifying regulatory interactions and biologically meaningful covariation is a major challenge. In contrast, even “systematic” sets of biochemical data are much smaller. However, biochemical data are also preselected to contain relevant signaling information and the primary challenge is creating a framework that is effective at modeling relatively sparse data from multiple cell types. The rather narrow purview of biochemical models also makes it likely that many important responses are missed because they are not measured. The future clearly lies in creating hybrid models that fuse biochemical data on immediate-early signaling with data on sequence variation, gene expression, and transcription factor binding. Such models would provide unique insight into mechanisms of oncogenic transformation and would have practical uses in drug discovery and tumor classification. For example, it has been shown that generic PINs and PKNs represent valuable prior knowledge for tumor classification from transcriptional profiles (47), and it seems likely that more accurate tumor-specific logical models or hybrid “interactomes” would prove even more useful in this context.

No potential conflicts of interest were disclosed.

We thank W. Chen, M. Niepel, J. Muhlich, S. Milton, S. Mirschel, and members of Pfizer RTC for support and useful discussions.

This work was supported by NIH grants GM68762 and CA112967.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1.
Saez-Rodriguez
J
,
Alexopoulos
LG
,
Epperlein
J
,
Samaga
R
,
Lauffenburger
DA
,
Klamt
S
, et al
Discrete logic modelling as a means to link protein signalling networks with functional analysis of mammalian signal transduction
.
Mol Syst Biol
2009
;
5
:
331
.
2.
Goldberg
D
. 
Genetic algorithms in search, optimization, and machine learning
.
Boston, MA
:
Addison-Wesley
; 
1989
.
3.
Kim
HD
,
Shay
T
,
O'Shea
EK
,
Regev
A
. 
Transcriptional regulatory circuits: predicting numbers from alphabets
.
Science
2009
;
325
:
429
32
.
4.
Pieroni
E
,
de la Fuente van Bentem
S
,
Mancosu
G
,
Capobianco
E
,
Hirt
H
,
de la Fuente
A
. 
Protein networking: insights into global functional organization of proteomes
.
Proteomics
2008
;
8
:
799
816
.
5.
Bauer-Mehren
A
,
Furlong
LI
,
Sanz
F
. 
Pathway databases and tools for their exploitation: benefits, current limitations and challenges
.
Mol Syst Biol
2009
;
5
:
290
.
6.
Cusick
ME
,
Yu
H
,
Smolyar
A
,
Venkatesan
K
,
Carvunis
A-R
,
Simonis
N
, et al
Literature-curated protein interaction datasets
.
Nat Meth
2009
;
6
:
39
46
.
7.
Przytycka
TM
,
Singh
M
,
Slonim
DK
. 
Toward the dynamic interactome: it's about time
.
Brief Bioinformatics
2010
;
11
:
15
29
.
8.
Cusick
ME
,
Klitgord
N
,
Vidal
M
,
Hill
DE
. 
Interactome: gateway into systems biology
.
Hum Mol Genet
2005
;
14
Spec No. 2
:
R171
81
.
9.
Carro
MS
,
Lim
WK
,
Alvarez
MJ
,
Bollo
RJ
,
Zhao
X
,
Snyder
EY
, et al
The transcriptional network for mesenchymal transformation of brain tumours
.
Nature
2009
;
463
:
318
25
.
10.
Aldridge
BB
,
Burke
JM
,
Lauffenburger
DA
,
Sorger
PK
. 
Physicochemical modelling of cell signalling pathways
.
Nat Cell Biol
2006
;
8
:
1195
203
.
11.
Parkin
DM
,
Bray
F
,
Ferlay
J
,
Pisani
P
. 
Global cancer statistics, 2002
.
CA Cancer J Clin
2005
;
55
:
74
108
.
12.
Llovet
JM
,
Bruix
J
. 
Molecular targeted therapies in hepatocellular carcinoma
.
Hepatology
2008
;
48
:
1312
27
.
13.
Podolin
PL
,
Callahan
JF
,
Bolognese
BJ
,
Li
YH
,
Carlson
K
,
Davis
TG
, et al
Attenuation of murine collagen-induced arthritis by a novel, potent, selective small molecule inhibitor of IkB kinase 2, TPCA-1 (2-[(aminocarbonyl)amino]-5-(4-fluorophenyl)-3-thiophenecarboxamide), occurs via reduction of proinflammatory cytokines and antigen-induced T cell proliferation
.
J Pharmacol Exp Therap
2005
;
312
:
373
81
.
14.
Birrell
MA
,
Hardaker
E
,
Wong
S
,
McCluskie
K
,
Catley
M
,
De Alba
J
, et al
Ik-B kinase-2 inhibitor blocks inflammation in human airway smooth muscle and a rat model of asthma
.
Am J Respir Crit Care Med
2005
;
172
:
962
71
.
15.
Alexopoulos
LG
,
Saez-Rodriguez
J
,
Cosgrove
B
,
Lauffenburger
DA
,
Sorger
PK
. 
Networks inferred from biochemical data uncover differences in TLR signaling between normal and transformed hepatocytes
.
Mol Cell Proteomics
2010
;
9
:
1849
65
.
16.
Lee
HC
,
Tian
B
,
Sedivy
JM
,
Wands
JR
,
Kim
M
. 
Loss of Raf kinase inhibitor protein promotes cell proliferation and migration of human hepatoma cells
.
Gastroenterology
2006
;
131
:
1208
17
.
18.
Saez-Rodriguez
J
,
Goldsipe
A
,
Muhlich
J
,
Alexopoulos
LG
,
Millard
B
,
Lauffenburger
DA
, et al
Flexible informatics for linking experimental data to mathematical models via DataRail
.
Bioinformatics
2008
;
24
:
840
7
.
19.
DataRail. Available from
: http://code.google.com/p/sbpipeline.
20.
Saez-Rodriguez
J
,
Mirschel
S
,
Hemenway
R
,
Klamt
S
,
Gilles
ED
,
Ginkel
M
. 
Visual setup of logical models of signaling and regulatory networks with ProMoT
.
BMC Bioinformatics
2006
;
7
:
506
.
21.
Ingenuity. Available from
: http://www.ingenuity.com.
23.
Gaudet
S
,
Janes
KA
,
Albeck
JG
,
Pace
EA
,
Lauffenburger
DA
,
Sorger
PK
. 
A compendium of signals and responses triggered by prodeath and prosurvival cytokines
.
Mol Cell Proteomics
2005
;
4
:
1569
90
.
24.
Hopkins
AL
,
Groom
CR
. 
The druggable genome
.
Nat Rev Drug Discov
2002
;
1
:
727
30
.
25.
Fuchs
BC
,
Fujii
T
,
Dorfman
JD
,
Goodwin
JM
,
Zhu
AX
,
Lanuti
M
, et al
Epithelial-to-mesenchymal transition and integrin-linked kinase mediate sensitivity to epidermal growth factor receptor inhibition in human hepatoma cells
.
Cancer Res
2008
;
68
:
2391
9
.
26.
Lee
J-S
,
Thorgeirsson
SS
. 
Functional and genomic implications of global gene expression profiles in cell lines from human hepatocellular cancer
.
Hepatology
2002
;
35
:
1134
43
.
27.
Yuzugullu
H
,
Benhaj
K
,
Ozturk
N
,
Senturk
S
,
Celik
E
,
Toylu
A
, et al
Canonical Wnt signaling is antagonized by noncanonical Wnt5a in hepatocellular carcinoma cells
.
Mol Cancer
2009
;
8
:
90
.
28.
Yasuda
E
,
Kumada
T
,
Takai
S
,
Ishisaki
A
,
Noda
T
,
Matsushima-Nishiwaki
R
, et al
Attenuated phosphorylation of heat shock protein 27 correlates with tumor progression in patients with hepatocellular carcinoma
.
Biochem Biophys Res Commun
2005
;
337
:
337
42
.
29.
Frame
S
,
Cohen
P
. 
GSK3 takes centre stage more than 20 years after its discovery
.
Biochem J
2001
;
359
:
1
16
.
30.
Khamzina
L
,
Gruppuso
PA
,
Wands
JR
. 
Insulin signaling through insulin receptor substrate 1 and 2 in normal liver development
.
Gastroenterology
2003
;
125
:
572
85
.
31.
Saad
MJA
,
Araki
E
,
Miralpeix
M
,
Rothenberg
PL
,
White
MF
,
Kahn
CR
. 
Regulation of insulin-receptor substrate-1 in liver and muscle of animal-models of insulin resistance
.
J Clin Invest
1992
;
90
:
1839
49
.
32.
Taniguchi
CM
,
Winnay
J
,
Kondo
T
,
Bronson
RT
,
Guimaraes
AR
,
Alemán
JO
, et al
The phosphoinositide 3-kinase regulatory subunit p85alpha can exert tumor suppressor properties through negative regulation of growth factor signaling
.
Cancer Res
2010
;
70
:
5305
15
.
33.
Bamborough
P
,
Drewry
D
,
Harper
G
,
Smith
GK
,
Schneider
K
. 
Assessment of chemical coverage of kinome space and its implications for kinase drug discovery
.
J Med Chem
2008
;
51
:
7898
914
.
34.
Petrelli
A
,
Giordano
S
. 
From single- to multi-target drugs in cancer therapy: when aspecificity becomes an advantage
.
Curr Med Chem
2008
;
15
:
422
32
.
35.
Berger
S
,
Iyengar
R
. 
Network analyses in systems pharmacology
.
Bioinformatics
2009
;
25
:
2466
72
.
36.
Nelander
S
,
Wang
W
,
Nilsson
B
,
She
Q-B
,
Pratilas
C
,
Rosen
N
, et al
Models from experiments: combinatorial drug perturbations of cancer cells
.
Mol Syst Biol
2008
;
4
:
11
.
37.
Mendoza
L
,
Xenarios
I
. 
A method for the generation of standardized qualitative dynamical systems of regulatory networks
.
Theor Biol Med Model
2006
;
3
:
13
.
38.
Wittmann
DM
,
Krumsiek
J
,
Saez-Rodriguez
J
,
Lauffenburger
DA
,
Klamt
S
,
Theis
FJ
. 
Transforming Boolean models to continuous models: methodology and application to T-cell receptor signaling
.
BMC Syst Biol
2009
;
3
:
98
.
39.
Sachs
K
,
Gifford
D
,
Jaakkola
T
,
Sorger
P
,
Lauffenburger
DA
. 
Bayesian network approach to cell signaling pathway modeling
.
Sci STKE
2002
;
2002:pe38.
40.
Aldridge
BB
,
Saez-Rodriguez
J
,
Muhlich
JL
,
Sorger
PK
,
Lauffenburger
DA
. 
Fuzzy logic analysis of kinase pathway crosstalk in TNF/EGF/insulin-induced signaling
.
PLoS Comput Biol
2009
;
5
:
e1000340
.
41.
Prill
RJ
,
Marbach
D
,
Saez-Rodriguez
J
,
Sorger
PK
,
Alexopoulos
LG
,
Xue
X
, et al
Towards a rigorous assessment of systems biology models: The DREAM3 Challenges
.
PLoS One
2010
;
5
:
e9202
.
42.
Ciaccio
MF
,
Wagner
JP
,
Chuu
C-P
,
Lauffenburger
DA
,
Jones
RB
. 
Systems analysis of EGF receptor signaling dynamics with microwestern arrays
.
Nat Methods
2010
;
7
:
148
55
.
43.
Gstaiger
M
,
Aebersold
R
. 
Applying mass spectrometry-based proteomics to genetics, genomics and network biology
.
Nat Rev Genet
2009
;
10
:
617
27
.
44.
Jørgensen
C
,
Sherman
A
,
Chen
GI
,
Pasculescu
A
,
Poliakov
A
,
Hsiung
M
, et al
Cell-specific information processing in segregating populations of Eph receptor ephrin-expressing cells
.
Science
2009
;
326
:
1502
9
.
45.
Jones
S
,
Zhang
X
,
Parsons
DW
,
Lin
JC-H
,
Leary
RJ
,
Angenendt
P
, et al
Core signaling pathways in human pancreatic cancers revealed by global genomic analyses
.
Science
2008
;
321
:
1801
6
.
46.
Bonnet
E
,
Michoel
T
,
Van de Peer
Y
. 
Prediction of a gene regulatory network linked to prostate cancer from gene expression, microRNA and clinical data
.
Bioinformatics
2010
;
26
:
i638
44
.
47.
Chuang
H-Y
,
Lee
E
,
Liu
Y-T
,
Lee
D
,
Ideker
T
. 
Network-based classification of breast cancer metastasis
.
Mol Syst Biol
2007
;
3
:
10
.
48.
GraphViz
.
Available from:
http://www.graphviz.org.
49.
Shannon
P
,
Markiel
A
,
Ozier
O
,
Baliga
NS
,
Wang
JT
,
Ramage
D
, et al
Cytoscape: a software environment for integrated models of biomolecular interaction networks
.
Genome Res
2003
;
13
:
2498
504
.