Abstract
Purpose: Complete response to induction chemotherapy is observed in ∼60% of patients with newly diagnosed non-M3 acute myelogenous leukemia (AML). However, no methods exist to predict with high accuracy at the individual patient level the response to standard AML induction therapy.
Experimental Design: We applied single-cell network profiling (SCNP) using flow cytometry, a tool that allows a comprehensive functional assessment of intracellular signaling pathways in heterogeneous tissues, to two training cohorts of AML samples (n = 34 and 88) to predict the likelihood of response to induction chemotherapy.
Results: In the first study, univariate analysis identified multiple signaling “nodes” (readouts of modulated intracellular signaling proteins) that correlated with response (i.e., AUCROC ≥ 0.66; P ≤ 0.05) at a level greater than age. After accounting for age, similar findings were observed in the second study. For patients <60 years old, complete response was associated with the presence of intact apoptotic pathways. In patients ≥60 years old, nonresponse was associated with FLT3 ligand–mediated increase in phosphorylated Akt and phosphorylated extracellular signal-regulated kinase. Results were independent of cytogenetics, FLT3 mutational status, and diagnosis of secondary AML.
Conclusions: These data emphasize the value of performing quantitative SCNP under modulated conditions as a basis for the development of tests highly predictive for response to induction chemotherapy. SCNP provides information distinct from other known prognostic factors such as age, secondary AML, cytogenetics, and molecular alterations and is potentially combinable with the latter to improve clinical decision making. Independent validation studies are warranted. Clin Cancer Res; 16(14); 3721–33. ©2010 AACR.
This article is featured in Highlights of This Issue, p. 3521
Current acute myelogenous leukemia (AML) prognostic markers are based on clinical characterization, such as age and performance status, or static measurements of leukemia biology at diagnosis, such as cytogenetics and molecular events (e.g., FLT3 ITD and NPM1 mutations). Although these methods offer directionally predictive information on disease outcomes, their accuracy is suboptimal and leaves room for improvement. Single-cell network profiling (SCNP) is a tool that allows a comprehensive functional assessment of biologically relevant signaling pathways at the single-cell level in potentially heterogeneous tissues. This study shows SCNP as a new way to characterize AML based on the disease biology of the individual patient, making the assay a potentially valuable tool in guiding clinical decision making.
Acute myelogenous leukemia (AML) displays biological and clinical heterogeneity due to a complex range of cytogenetic and molecular aberrations that result in downstream effects on gene expression, protein function, and cell signal transduction pathways, ultimately affecting proliferation, survival, and cellular differentiation (1–4). Historically, morphologic and cytochemical methods have formed the basis for AML classification (3), although they fail to adequately inform therapeutic decision making for most patients. Other methods such as cytogenetics (5–7), gene expression profiling (8, 9), microRNA profiling (10, 11), epigenetic profiling (12), and proteomic profiling (13, 14) have been used to elucidate the biological and clinical heterogeneity of AML, and some of the molecular changes identified in these studies have now been shown to be associated with disease outcomes (15–31). Karyotype, NPM1 gene mutation, and overexpression of the brain and acute leukemia cytoplasmic (BAALC) and meningioma 1 (MN1) genes at presentation have been shown in different studies to be associated with response to induction therapy (32–34), and the first two markers are currently considered in the treatment decision-making process for non-M3 AML patients, particularly those 60 years of age or older. However, the association of those markers with patient outcomes is not perfect and there is room for further improvement. Because chromosomal, genetic, epigenetic, and other molecular alterations converge at the level of protein function and cell signaling pathways, we reasoned that tools assessing this aspect of disease biology will potentially have a high predictive value. Basal protein expression profiling patterns in AML as measured by reverse-phase protein arrays were recently shown to correlate with known morphologic features, cytogenetics, remission, relapse, and overall survival (13).
Although these studies show high sensitivity and reproducibility for baseline measurements of protein levels at the individual patient level, they do not provide an evaluation of dynamic protein responses to external stimuli in specific cell subpopulations (such as leukemic stem cells) that are present in a heterogeneous population of AML cells from bone marrow or peripheral blood.
Single-cell network profiling (SCNP), using flow cytometry, characterizes cell signaling on exposure of cells to extracellular modulators, revealing network properties that would not be seen in resting cells, thus allowing for detection of functional heterogeneity between AML samples as well as within an AML sample (4). Pathway responses can include failure to become activated, hypersensitivity/hyposensitivity of the pathway to modulators, altered response kinetics, and rewiring of canonical pathways. This method of mapping signaling networks has potential applications in clinical medicine with the development of tests predictive of therapeutic response and in drug development (e.g., when applied to pathways shown to be important in disease pathology) to improve the overall efficiency of the process (4, 35–37).
In the current study, using two sequential training cohorts, SCNP was used to comprehensively analyze modulated Janus-activated kinase (JAK)/signal transducer and activator of transcription (STAT) and phosphatidylinositol 3-kinase (PI3K) pathways, phosphatase activation, and apoptosis signaling in AML blasts to identify the most important proteomic profiles associated with disease response to AML induction chemotherapy.
Materials and Methods
Patient samples
In accordance with the Declaration of Helsinki, all patients provided informed consent for the collection and use of their samples for research purposes. Each study was approved by the Institutional Review Board of the respective institution. Clinical data were deidentified in compliance with Health Insurance Portability and Accountability Act regulations. Sample inclusion criteria included diagnosis of non-M3 AML (note that M3 AML patients receive a different standard of therapy), collection before initiation of induction chemotherapy, and availability of disease and treatment annotations. All samples underwent Ficoll-Hypaque fractionation before cryopreservation in FCS and 10% DMSO and storage at liquid nitrogen temperature.
The first sample set consisted of 35 cryopreserved peripheral blood mononuclear cell (PBMC) samples collected from AML patients treated at hospitals affiliated with the University Health Network [Princess Margaret Hospital (PMH)/UHN], University of Toronto, between September 1998 and September 2007. Induction chemotherapy consisted of one cycle of standard cytarabine-based induction therapy (daunorubicin, 60 mg/m² × 3 days; cytarabine, 100-200 mg/m² continuous infusion × 7 days). Response to therapy was measured after one cycle of induction therapy. The second sample set consisted of 134 cryopreserved bone marrow mononuclear cell (BMMC) samples collected from AML patients treated at M.D. Anderson Cancer Center (MDACC) between September 1999 and September 2006. Induction chemotherapy consisted of one or two cycles of cytarabine (200 mg/m² to 3 g/m²) in combination with an anthracycline (daunorubicin or idarubicin) or an additional antimetabolite (e.g., fludarabine or troxacitabine), and sometimes an experimental agent (Table 1). Best response was measured after completion of induction therapy (>90% of patients received one cycle; the remainder received two cycles).
Characteristic | Study 1: CR patients | Study 1: NR patients | Study 1: All patients | Study 1: P* | Study 2: CR patients | Study 2: NR patients | Study 2: All patients | Study 2: P* |
---|---|---|---|---|---|---|---|---|
n | 9 | 25 | 34 | | 57 | 31 | 88 | |
Age (y) | | | | | | | | |
Median | 57 | 47.4 | 49.1 | 0.084 | 51.2 | 61.6 | 55.2 | 0.004 |
Range | 38.2-74.8 | 20.7-70.2 | 20.7-74.8 | | 27.0-79.0 | 25.0-76.3 | 25.0-79.0 | |
Age group (y) | | | | | | | | |
<60 | 5 (56%) | 20 (80%) | 25 (74%) | 0.201 | 51 (89%) | 15 (48%) | 66 (75%) | <0.001 |
≥60 | 4 (44%) | 5 (20%) | 9 (26%) | | 6 (11%) | 16 (52%) | 22 (25%) | |
Sex | | | | | | | | |
F | 7 (78%) | 14 (56%) | 21 (62%) | 0.427 | 32 (56%) | 16 (52%) | 48 (55%) | 0.823 |
M | 2 (22%) | 11 (44%) | 13 (38%) | | 25 (44%) | 15 (48%) | 40 (45%) | |
Cytogenetic group | | | | | | | | |
Favorable | 0 (0%) | 1 (4%) | 1 (3%) | 0.639 | 7 (12%) | 0 (0%) | 7 (8%) | 0.004 |
Intermediate | 8 (89%) | 18 (72%) | 26 (76%) | | 29 (51%) | 9 (29%) | 38 (43%) | |
Unfavorable | 0 (0%) | 3 (12%) | 3 (9%) | | 21 (37%) | 22 (71%) | 43 (49%) | |
Not done | 1 (11%) | 3 (12%) | 4 (12%) | | 0 (0%) | 0 (0%) | 0 (0%) | |
FAB | | | | | | | | |
M0 | 0 (0%) | 2 (8%) | 2 (6%) | 0.474 | 1 (2%) | 1 (3%) | 2 (2%) | 0.794 |
M1 | 2 (22%) | 2 (8%) | 4 (12%) | | 8 (14%) | 1 (3%) | 9 (10%) | |
M2 | 1 (11%) | 5 (20%) | 6 (18%) | | 22 (39%) | 14 (45%) | 36 (41%) | |
M4 | 1 (11%) | 7 (28%) | 8 (24%) | | 14 (25%) | 8 (26%) | 22 (25%) | |
M5 | 3 (33%) | 2 (8%) | 5 (15%) | | 8 (14%) | 4 (13%) | 12 (14%) | |
M6 | 0 (0%) | 0 (0%) | 0 (0%) | | 2 (4%) | 2 (6%) | 4 (5%) | |
Other and unknown | 2 (22%) | 7 (28%) | 9 (27%) | | 2 (4%) | 1 (3%) | 3 (3%) | |
Race | | | | | | | | |
White | 3 (33%) | 17 (68%) | 20 (59%) | 0.201 | 15 (26%) | 15 (48%) | 30 (34%) | 0.127 |
Asian | 5 (56%) | 5 (20%) | 10 (29%) | | 1 (2%) | 1 (3%) | 2 (2%) | |
Other† | 1 (11%) | 2 (8%) | 3 (9%) | | 10 (18%) | 1 (3%) | 11 (13%) | |
Unknown | 0 (0%) | 1 (4%) | 1 (3%) | | 31 (54%) | 14 (45%) | 45 (51%) | |
FLT3 ITD | | | | | | | | |
Negative | 4 (44%) | 14 (56%) | 18 (53%) | 0.641 | 44 (77%) | 23 (74%) | 67 (76%) | 0.477 |
Positive | 5 (56%) | 10 (40%) | 15 (44%) | | 11 (19%) | 5 (16%) | 16 (18%) | |
Unknown | 0 (0%) | 1 (4%) | 1 (3%) | | 2 (4%) | 3 (10%) | 5 (3%) | |
Secondary AML | | | | | | | | |
No | 8 (89%) | 25 (100%) | 33 (97%) | 0.265 | 47 (82%) | 14 (45%) | 61 (69%) | <0.001 |
Yes | 1 (11%) | 0 (0%) | 1 (3%) | | 10 (18%) | 17 (55%) | 27 (31%) | |
Poor prognosis‡ | | | | | | | | |
No | 5 (56%) | 18 (72%) | 23 (68%) | 0.425 | 22 (39%) | 3 (10%) | 25 (28%) | 0.004 |
Yes | 4 (44%) | 7 (28%) | 11 (32%) | | 35 (61%) | 28 (90%) | 63 (72%) | |
Induction therapy | | | | | | | | |
Standard 3 + 7 | 9 (100%) | 25 (100%) | 34 (100%) | n/a | 0 (0%) | 0 (0%) | 0 (0%) | |
Fludarabine + HDAC | 0 (0%) | 0 (0%) | 0 (0%) | | 11 (19%) | 2 (6%) | 13 (15%) | 0.222 |
IA + Zarnestra | 0 (0%) | 0 (0%) | 0 (0%) | | 18 (32%) | 9 (29%) | 27 (31%) | |
IDA + HDAC | 0 (0%) | 0 (0%) | 0 (0%) | | 17 (30%) | 9 (29%) | 26 (30%) | |
Other | 0 (0%) | 0 (0%) | 0 (0%) | | 11 (19%) | 11 (35%) | 22 (25%) | |
NOTE: All 25 NRs in study 1 had primary refractory AML. In study 2, the 31 NRs comprised 25 patients with primary refractory AML and 6 patients who died during induction treatment (i.e., failure).
*The two-sample t test was used to compare mean (data not shown) ages of CR and NR patients. Fisher's exact test was used to compare CR and NR patients with respect to categorical variables with two levels. The standard χ² test was used to compare CR and NR patients with respect to categorical variables with three or more levels.
†The other values for race are based on Black and Hispanic subgroups.
‡Poor prognosis is defined as having one or more of the following high-risk features: age ≥60 y, unfavorable cytogenetics, FLT3 ITD positive, or secondary AML.
Standard clinical and laboratory criteria were used for defining complete response (CR) in both studies (38). Leukemia samples obtained from patients who did not meet the criteria for CR or who died during induction therapy were considered non-CR [i.e., nonresponse (NR) for the analyses]. Each study had one patient who met all the criteria for a clinical CR with the exception of platelet recovery (CRp). These CRp samples were included in the CR group for the analysis. For a patient sample to be evaluable (i.e., included in the analysis), >500 viable cells in the leukemic cell population (defined below) per condition were required. Thirty-four of 35 samples were evaluable for study 1 and 88 of 134 samples for study 2 (Table 1).
Study design
Two prospectively designed training studies were conducted sequentially with archived, cryopreserved, clinically annotated diagnostic AML samples. The first was a smaller study that was unintentionally enriched for patients of younger age with de novo AML and primary refractory disease. The sample size for the first study was driven by availability of samples in the UHN tissue bank meeting patient and sample requirements described above. The second study was larger and included AML samples collected from an AML patient population more representative, in terms of baseline disease characteristics, of the U.S. AML population. Based on the data from the first study, we estimated that with ∼40 patients for each treatment outcome group (CR and NR), the second study would have >0.95 power at a significance level of 0.05. The sample size of 134 was based on an expected ratio of 2:1 for CR to NR patients plus 10% overage. False discovery rate analysis was done in both studies to estimate the rate of chance correlation in the data sets. All assays were conducted in a blinded fashion to clinical outcomes.
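The power statement above can be illustrated with a rough calculation. The sketch below is only an illustration under stated assumptions: the text does not say which test or effect size underlay the power estimate, so the two-sided comparison of means, the normal approximation, and the standardized effect sizes are all hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample comparison of means.

    Uses the normal approximation. `effect_size` is a standardized mean
    difference (Cohen's d); it is a hypothetical input, not a value
    reported in the study.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic falls in either rejection tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # About 40 patients per outcome group, as in the study 2 planning.
    for d in (0.5, 0.8, 1.0):
        print(f"d = {d}: power = {two_sample_power(d, 40):.3f}")
```

Under this approximation, 40 patients per group exceeds 0.95 power only for fairly large standardized differences, which is consistent with a design aimed at strongly stratifying nodes.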
Pathways evaluated
Four groups of cellular functions (Fig. 1), chosen based on their relevance to AML pathophysiology, were evaluated:
Response to chemokines, cytokines, and growth factor (“CCG” pathways): Modulated cell signaling pathways known to be altered in hematologic malignancies and for which commercial reagents exist were measured. These included stem cell factor (SCF) and FLT3 ligand (FLT3L)–mediated PI3K/Akt activation (important for maintaining the hematopoietic stem cell pool; refs. 39, 40) and phospholipase Cγ (PLCγ)/cyclic AMP–responsive element binding protein (CREB) pathways; granulocyte colony-stimulating factor (G-CSF)–mediated JAK/STAT activation (important for neutrophilic differentiation of hematopoietic progenitor cells; ref. 41); interleukin (IL)-6 family members, including IL-27–mediated JAK/STAT and CREB activation (important in regulating proliferation, differentiation, and functional maturation of cells belonging to multiple hematopoietic lineages; ref. 42); and IL-10–mediated JAK/STAT activation (important in modulating the immune response of monocytes and macrophages and shown to play a role in AML blast proliferation; ref. 43).
Phosphatase activity: The role of phosphatases in signaling regulation was determined through the use of H2O2, an intracellular second messenger and general tyrosine phosphatase inhibitor (44), used as a single agent or in combination with another modulator (44). Of note, H2O2 has also been described as having effects via generation of reactive oxygen species.
Expression of surface proteins: Expression of drug transporter proteins, known to be associated with adverse prognosis in AML (45, 46), and surface myeloid growth factor receptors, such as c-Kit and FLT3R, was measured.
Apoptosis: Transformed cells evade apoptosis by activating survival pathways and/or disabling apoptotic pathways. Caspase-dependent apoptosis pathways were measured after in vitro exposure of AML samples to etoposide or 1-β-d-arabinofuranosylcytosine (ara-C)/daunorubicin and staurosporine (in the presence and absence of ZVAD).
SCNP assay terminology
The term “signaling node” is used to refer to a proteomic readout in the presence or absence of a specific modulator. For example, the response to G-CSF stimulation can be measured using phosphorylated (p)-STAT5 as a readout. That signaling node is designated “G-CSF→p-STAT5.” Several metrics (normalized assay readouts defined below and summarized in Fig. 2) are applied to interpret the functionality and biology of each signaling node and are referenced following the node (e.g., “G-CSF→p-STAT5 ∣ Fold,” “G-CSF→p-STAT5 ∣ Total,” or “p-STAT5 ∣ Basal”).
A total of 147 nodes were evaluated in study 1 (18 basal states, 8 surface markers, and 121 modulated readouts; see pathway evaluation above). Based on findings from study 1 (specifically identification of nonfunctional signaling nodes), the number of signaling nodes evaluated in study 2 was reduced to 90 (16 basal states, 5 surface markers, and 69 modulated readouts; Supplementary Table S1). Each node was evaluated with 1 to 3 metrics, for a total of 304 and 182 node/metrics, respectively.
Sample cell recovery and viability
Cell recovery and viability after cryopreservation and thaw were variable. For study 1, ∼6.8 × 10⁶ cells were required to interrogate all 147 nodes for each patient. All 34 evaluable patient samples had sufficient cells to measure 101 nodes; 26 patient samples had sufficient cells to measure all of the additional 46 nodes. For study 2, ∼4.7 × 10⁶ cells per sample were required to measure all 90 nodes. However, several samples had far fewer cells than required. Consequently, depending on the node/metric, the number of patients for which data were available ranged from 9 to 88. Of note, the number of cells available for each sample was not correlated with patient age, blast count, cytogenetic group, or clinical response (data not shown). The numbers of donors available for each analysis are documented in the relevant tables.
SCNP assay
SCNP assays were done as described previously (4). Cryopreserved samples were thawed at 37°C, washed, and centrifuged in PBS, 10% fetal bovine serum (FBS), and 2 mmol/L EDTA. The cells were resuspended, filtered to remove debris, and washed in RPMI 1640/1% FBS before staining with Aqua Viability Dye to distinguish nonviable cells. The cells were resuspended in RPMI 1640/1% FBS, aliquoted to 100,000 cells per condition, and rested for 1 to 2 hours at 37°C. For apoptosis assays, cells were incubated with cytotoxic drugs for 6 hours (e.g., staurosporine) or 24 hours (e.g., etoposide or ara-C and daunorubicin) and restained with Aqua Viability Dye. For all other assays, cells were incubated with modulators (Supplementary Table S2A) at 37°C for 3 to 15 minutes. After exposure to modulators, cells were fixed with 1.6% paraformaldehyde (final concentration) for 10 minutes at 37°C, pelleted and permeabilized with 100% ice-cold methanol, and stored at −80°C overnight. Subsequently, cells were washed with fluorescence-activated cell sorting buffer (PBS/0.5% bovine serum albumin/0.05% NaN3), pelleted, and stained with cocktails of fluorochrome-conjugated antibodies (Supplementary Table S2B). These cocktails included antibodies against two to five phenotypic markers for gating cell populations (e.g., CD45 and CD33), up to three antibodies against intracellular signaling molecules, or against surface markers for an eight-color flow cytometry assay. Isotype controls or phosphopeptide blocking experiments were done to qualify phospho-antibodies.
Flow cytometry data acquisition and analysis
Flow cytometry data were acquired on an LSR II and/or FACSCanto II flow cytometer using the FACSDiva software (BD Biosciences). All flow cytometry data were analyzed with FlowJo (TreeStar Software) or WinList (Verity House Software). Dead cells and debris were excluded by forward scatter, side scatter, and Amine Aqua Viability Dye measurement. All analyses were based on leukemic cells (20-95% of a given cell preparation), which were identified as cells fitting the CD45 and CD33 versus right-angle light scatter characteristics, consistent with myeloid leukemia cells and lacking the characteristics of mature lymphocytes (CD45+, CD33−; ref. 47).
Metrics, statistical methods, and stratifying node analysis
Metrics
Several metrics were developed to measure the biology of functional signaling proteins (Fig. 2; Supplementary Fig. S1). To measure basal levels of signaling in the resting, unmodulated state, the “Basal” metric was applied. With modulation, the “Fold” metric identifies the inducibility or responsiveness of a protein or pathway. The “Total” metric was developed to assess the magnitude of total activated protein. Total incorporates both basal and induced pathway activation and is more relevant in measuring pathways regulated by activity thresholds.
For surface markers, the Relative Protein Expression (“Rel. Expression”) was used to measure the amount of surface expression, and the Percent Positive (“PercentPos”) was used to quantify the frequency of cells positive for a surface marker relative to a control antibody.
For apoptosis conditions, the percentage of cells in a two-dimensional flow plot quadrant “Quad” region [i.e., defined by low levels of p-Chk2 (measuring DNA damage response) and high levels of caspase product cleaved poly(ADP-ribose) polymerase (PARP; measuring cell death); p-Chk2−,c-PARP+ quadrant] was used to quantify levels of cellular apoptosis in response to cytotoxic drugs.
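As a minimal sketch of the “Quad” readout described above, the function below computes the percentage of cells in the p-Chk2−,c-PARP+ quadrant from per-cell intensities. The cutoff values and the tuple layout are hypothetical; the text does not state how the quadrant gates were set.

```python
def quad_percent(cells, chk2_cutoff, parp_cutoff):
    """Percentage of cells that are p-Chk2 low and c-PARP high.

    `cells` is a list of (p_chk2, c_parp) intensity pairs, one per cell.
    Both cutoffs are hypothetical gate positions supplied by the caller.
    """
    if not cells:
        raise ValueError("at least one cell is required")
    in_quadrant = sum(
        1
        for p_chk2, c_parp in cells
        if p_chk2 < chk2_cutoff and c_parp >= parp_cutoff
    )
    return 100.0 * in_quadrant / len(cells)


if __name__ == "__main__":
    # Four illustrative cells; two fall in the p-Chk2-, c-PARP+ quadrant.
    example = [(1.0, 10.0), (5.0, 10.0), (1.0, 1.0), (0.5, 20.0)]
    print(quad_percent(example, chk2_cutoff=2.0, parp_cutoff=5.0))
```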
Reproducibility
Cell lines, healthy BMMC, and healthy PBMC were included as controls to monitor assay performance in both studies. Two vials of cryopreserved cells were available for each evaluable patient sample (n = 34) in study 1, thus allowing assessment of reproducibility; duplicate vials were processed on separate days. Reproducibility for a total of 62 CCG node/metrics was assessed by calculating Pearson correlation coefficients (R) on replicate assays. Limited numbers of cells in vial 2 precluded the comparison of all conditions; therefore, apoptosis and surface markers were not included in run 2.
Correlations between node/metrics
Pearson correlation coefficients were computed between all pairs of node/metrics. In addition, for reproducibility evaluation, Spearman correlation is provided to better assess the possible effect of outliers.
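The two correlation measures can be sketched in plain Python as follows. This is an illustrative implementation, not the software used in the study, and the replicate values in the usage example are invented to show how a single outlier affects Pearson but not Spearman correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either sequence is constant.
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical replicate readouts for one node/metric (run 1 vs run 2).
    run1 = [1.0, 2.0, 3.0, 4.0, 100.0]
    run2 = [1.1, 2.0, 2.9, 4.2, 10.0]
    print(pearson(run1, run2), spearman(run1, run2))
```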
Association between node/metric and clinical response—univariate analysis
All node/metrics (n = 304 in study 1; n = 182 in study 2) were independently tested for their ability to classify patients based on their disease response to standard induction therapy. Due to the small sample size and nonnormal distribution (based on visual inspection) of some node/metrics, both Student's t test and Wilcoxon P values were computed. False discovery rate and overall significance of the number of node/metrics found to classify at a given P value were addressed through simulations described in Supplementary Materials and Methods. Simulation methods were used in preference to a multitest correction because of high correlations between many node/metrics and a higher tolerance for false-positive results in these early training studies, which focused mainly on the reduction of candidate stratifying nodes rather than on the selection of a specific classifier. The area under the curve of the receiver operating characteristic (AUCROC; refs. 48–50) was computed to assess classification accuracy of each node.
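As a sketch of how an AUCROC can be computed for one node/metric, the code below uses the rank-based (Mann-Whitney) definition and then applies the stratifying criterion used in this study (AUCROC ≥ 0.66 and P ≤ 0.05 by either test). The P values are taken as inputs rather than computed, and treating the AUCROC as direction-agnostic is an assumption made here for illustration.

```python
def auc_roc(cr_values, nr_values):
    """AUCROC as the probability that a random NR value exceeds a random CR value.

    Ties count as one half. Equivalent to the Mann-Whitney U statistic
    divided by the number of CR/NR pairs.
    """
    if not cr_values or not nr_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for nr in nr_values:
        for cr in cr_values:
            if nr > cr:
                wins += 1.0
            elif nr == cr:
                wins += 0.5
    return wins / (len(cr_values) * len(nr_values))


def is_stratifying(auc, p_ttest, p_wilcoxon, auc_min=0.66, p_max=0.05):
    """Apply the AUCROC and P value cutoffs to one node/metric.

    Assumption: a node that is higher in CR than in NR (AUC below 0.5)
    is scored by 1 - AUC so that either direction can qualify.
    """
    return max(auc, 1.0 - auc) >= auc_min and min(p_ttest, p_wilcoxon) <= p_max


if __name__ == "__main__":
    # Hypothetical readouts for one node/metric.
    cr = [0.2, 0.4, 0.5, 0.9]
    nr = [0.6, 0.8, 1.1, 1.3, 1.5]
    print(auc_roc(cr, nr))
```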
Association between multiple node/metric and clinical response—multivariate analysis
We explored combining node/metrics into a multivariate classifier with improved outcome classification ability using a simple rule-based approach combining pairs or triplets of individual stratifying node/metrics independent of each other. Given the limited size of this data set, this modeling exercise was done to explore potential combinations within or across pathways that might be of interest in future studies. In brief, this method divides subjects into two classes (dichotomized response to induction therapy) using one, two, or three node/metrics. For individual node/metrics, a threshold or cutoff value was selected that correctly classifies all CRs. These thresholds are then combined logically in pairs or triplets to create regions in either two or three dimensions. These two- or three-dimensional regions are expected to include most members of the CRs and to exclude most of the NRs. Because of the significant differences in the number of donors for which data were available for different nodes (due to limited cell recovery), multivariate analysis was not deemed appropriate for study 2.
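The rule-based combination can be sketched as follows. This is a simplified illustration, assuming that every node is higher in NR than in CR (so that the cutoff is the largest CR value) and that thresholds are combined with a logical AND; the node names are hypothetical.

```python
def fit_thresholds(cr_samples, nodes):
    """Per-node cutoff that classifies every CR sample correctly.

    Assumes NR samples tend to have higher values, so the cutoff for
    each node is the maximum value observed among the CR samples.
    """
    return {node: max(sample[node] for sample in cr_samples) for node in nodes}


def predict_cr(sample, thresholds):
    """Predict CR only if the sample is at or below every node's cutoff."""
    return all(sample[node] <= cutoff for node, cutoff in thresholds.items())


if __name__ == "__main__":
    # Hypothetical readouts for two node/metrics.
    cr_samples = [{"node_a": 1.0, "node_b": 2.0}, {"node_a": 2.0, "node_b": 1.0}]
    nr_samples = [
        {"node_a": 3.0, "node_b": 1.0},  # excluded by node_a
        {"node_a": 1.0, "node_b": 5.0},  # excluded by node_b
        {"node_a": 1.5, "node_b": 1.5},  # inside the CR region (misclassified)
    ]
    cutoffs = fit_thresholds(cr_samples, ["node_a", "node_b"])
    print([predict_cr(s, cutoffs) for s in cr_samples])
    print([predict_cr(s, cutoffs) for s in nr_samples])
```

By construction, every CR sample used to set the cutoffs falls inside the combined region, so the performance of a pair or triplet is determined by how many NR samples it excludes.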
Results
Study 1
Patient and sample characteristics
Thirty-four of the 35 cryopreserved AML PBMC samples in the study were evaluable after thawing. This sample set was chosen based on the availability of a large number of cryopreserved cells collected at the time of diagnosis. This created a bias toward patients with high initial white blood counts and, hence, an overall worse prognosis. In comparison with the general AML patient population, this group was biased toward younger (<60 years), female, and Asian (29%) patients, intermediate-risk cytogenetics (76%), and NR after induction chemotherapy (Table 1). Ten of 18 (56%) cytogenetically normal samples contained a FLT3 ITD mutation, indicating a poor-prognosis patient group (20, 28, 31).
Assay reproducibility
The Pearson correlation coefficient was ≥0.8 for 32 of 62 CCG signaling node/metrics in the replicated assays (Supplementary Table S3). Assay reproducibility was highest for those node/metrics with the largest range of signaling [e.g., PMA→p-S6 ∣ Fold (R = 0.95), SCF→p-S6 ∣ Fold (R = 0.91), FLT3L→p-Akt ∣ Fold (R = 0.92), and G-CSF→p-STAT5 ∣ Fold (R = 0.86)] and lower for nodes with low signaling [e.g., SDF1α→p-S6 ∣ Fold (R = 0.12) and IL-27→p-S6 ∣ Fold (R = 0.2)]. Only node/metrics that were both reproducible and differentially associated with CR/NR outcomes were considered good candidate nodes for future clinical assays.
Association between node/metric and clinical response—univariate analysis
Univariate analysis was done on 304 node/metrics for the ability to classify patient response to AML induction therapy, and an AUCROC was calculated for each node/metric. Patient age is a known prognostic factor associated with likelihood of AML response to induction therapy (with AUCROC of ∼0.65 in this study). Therefore, node/metrics were considered stratifying only if they had an AUCROC of ≥0.66 and a P value of ≤0.05 using either the Student's t test or the Wilcoxon test. Fifty-eight node/metrics (18% of the node/metrics assessed) met these criteria (Supplementary Table S4). An assessment of false discovery rate (Supplementary Fig. S2A) indicated that the number of “significant” nodes occurring by chance in this data set was <2%. Table 2A shows a summary of these stratifying nodes listed by pathways, whereas Supplementary Table S4 provides the raw supporting data. In study 1, basal levels of a few phosphorylated signaling proteins (n = 5) stratified patients by clinical response to induction therapy as indicated by their AUCROC values: p-CREB ∣ Basal (0.87), phospho–extracellular signal-regulated kinase (p-ERK) ∣ Basal (0.77), p-PLCγ2 ∣ Basal (0.79), p-STAT3 ∣ Basal (0.81), and p-STAT6 ∣ Basal (0.76); specifically, NR samples showed higher basal levels of these phosphorylated proteins compared with CR samples (Table 2A). Modulated signaling for four of five of these proteins also classified patient response, and several nodes that did not stratify in the basal state showed correlation to induction therapy response when assessed in the modulated state. For the majority of stratifying nodes (48 of 58), modulation of signaling was required to allow correlation with response. These modulated readouts confirm previous findings that samples from NR patients show increased growth factor–mediated signaling compared with samples from CR patients (4).
In addition, etoposide-mediated decreased levels of p-Chk2 and increased levels of c-PARP (etoposide→p-Chk2− and c-PARP+ ∣ Quad) were seen more often in CR samples than in NR samples (0.81; Table 2A).
A. Study 1 | CR | NR |
---|---|---|
Surface markers | ||
ABCG2, c-Kit, FLT3R | — | ↑ |
Basal | ||
p-CREB, p-ERK, p-PLCγ, p-STAT3, p-STAT6 | — | ↑ |
Modulated signaling | ||
Growth factor–mediated signaling | — | ↑ |
G-CSF, GM-CSF→p-STATs | ||
FLT3L→p-Akt, p-CREB, p-S6 | ||
SCF→p-Akt, p-CREB, p-PLCγ2 | ||
Cytokine-mediated signaling | — | ↑ |
IL-6, IL-10, IL-27→p-STATs | ||
IFNα→p-STATs | ||
Apoptosis | ↑ | — |
Etoposide→p-Chk2−,c-PARP+ | ||
B. Study 2 | CR | NR |
Patients <60 y | ||
Modulated signaling | ||
Cytokine-mediated signaling | — | ↑ |
IL-27→p-STATs | ||
IFNα→p-STATs | ||
Apoptosis | ↑ | — |
Etoposide→p-Chk2−,c-PARP+ | ||
Ara-C + daunorubicin→p-Chk2−,c-PARP+* | ||
Patients ≥60 y | ||
Modulated signaling | ||
Growth factor–mediated signaling | — | ↑ |
FLT3L→p-Akt, p-ERK, p-S6 | ||
SCF→p-S6 | ||
Cytokine-mediated signaling | — | ↑ |
IL-27→p-STATs |
NOTE: For detailed information for each study, see Supplementary Tables S5 and S10.
*Ara-C + daunorubicin was only assessed/added for study 2.
Correlations between nodes/metrics
Pearson correlation coefficients were calculated for all pairwise combinations of stratifying node/metrics (AUCROC ≥ 0.66; P ≤ 0.05). For brevity, the results are shown for the Fold metric on the CCG readouts (Supplementary Table S5). Higher correlations were observed (and expected) between nodes measuring signaling events in the same pathway, such as FLT3L→p-Akt ∣ Fold and FLT3L→p-S6 ∣ Fold (R = 0.63), suggesting that these nodes measure common biology. By contrast, lower correlations were observed between nodes measuring signaling events in different pathways [such as SCF→p-Akt ∣ Fold and IL-3→STAT3 ∣ Fold (R = 0.01), and SDF1α→p-Akt ∣ Fold and G-CSF→p-STAT5 ∣ Fold (R = −0.01)], suggesting that these nodes measure different biology and might be combined to produce a multivariate model with higher association with AML response to standard induction therapy.
Finally, comparing the expression level of a receptor tyrosine kinase and the modulated downstream signaling readout for that receptor shows that the signaling readouts provide independent information. In fact, comparison of FLT3R expression levels, regardless of mutational status, with the corresponding ligand-activated pathway readouts showed only moderate correlation with FLT3L→p-S6 ∣ Fold (R = 0.44) and FLT3L→p-Akt ∣ Fold (R = 0.16), whereas both receptor levels and downstream signaling nodes stratified for response. Similar results were observed for c-Kit expression levels and SCF-induced signaling [SCF→p-Akt ∣ Fold (R = 0.59) and SCF→p-ERK ∣ Fold (R = 0.29)].
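As a reminder of what the R values above measure, the following is a minimal sketch of the Pearson correlation coefficient for two node/metrics measured on the same samples. The function name and the toy inputs are illustrative assumptions, not part of the study's analysis pipeline.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two readouts taken on the same samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least 2 samples")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Toy fold-change values (invented): the first pair moves together,
# the second pair shows no linear relationship.
same_pathway = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
different_pathway = pearson_r([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0])
```

A value near 1 (as in the first toy pair) suggests two nodes report the same underlying biology, whereas a value near 0 (the second pair) is the pattern that makes nodes candidates for combination in a multivariate model.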
Association between multiple node/metric and clinical response—rule-based multivariate analysis
The rule-based method was applied to all node/metrics with an AUCROC of ≥0.66 to assess whether combinations of two or three nodes might stratify patients better than individual nodes and so be of interest for classifier development in future studies. This analysis suggests that some node combinations have better sensitivity/specificity in distinguishing CRs from NRs. For example, when considered independently, SCF→p-ERK ∣ Fold (<0.17) correctly classifies all CRs, as does IL-27→p-STAT3 ∣ Total (<1.1; Fig. 3C), but 17 and 8 NRs are incorrectly classified, respectively (i.e., each node has high sensitivity but low specificity as a single-node classifier). Combining the two nodes resulted in a two-dimensional region that retained correct classification of all CRs and misclassified fewer NRs (n = 6; i.e., same sensitivity but increased specificity). When used in combination with a third node, as shown in Fig. 3D, all CRs are correctly classified and the number of misclassified NRs (n = 2) is further reduced.
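The effect of combining threshold rules can be sketched as follows. Only the two cutoffs (<0.17 and <1.1) come from the text; the `Sample` class, the function names, and all sample values are invented for illustration and do not reproduce the study data or the authors' rule-based method.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class Sample:
    scf_perk_fold: float      # SCF -> p-ERK | Fold
    il27_pstat3_total: float  # IL-27 -> p-STAT3 | Total
    is_cr: bool               # observed clinical response


def rule_scf(s: Sample) -> bool:
    """Call CR when SCF -> p-ERK | Fold is below the cutoff from the text."""
    return s.scf_perk_fold < 0.17


def rule_il27(s: Sample) -> bool:
    """Call CR when IL-27 -> p-STAT3 | Total is below the cutoff from the text."""
    return s.il27_pstat3_total < 1.1


def rule_both(s: Sample) -> bool:
    """Two-dimensional region: call CR only when both readouts are low."""
    return rule_scf(s) and rule_il27(s)


def sensitivity_specificity(samples: Iterable[Sample],
                            predict_cr: Callable[[Sample], bool]) -> Tuple[float, float]:
    """Sensitivity = CRs called CR; specificity = NRs called NR."""
    tp = fn = tn = fp = 0
    for s in samples:
        called_cr = predict_cr(s)
        if s.is_cr:
            tp += called_cr
            fn += not called_cr
        else:
            fp += called_cr
            tn += not called_cr
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one CR and one NR sample")
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy cohort: 2 CRs and 4 NRs.
TOY_SAMPLES = [
    Sample(0.05, 0.6, True),
    Sample(0.10, 0.9, True),
    Sample(0.30, 1.5, False),  # high on both readouts
    Sample(0.12, 1.4, False),  # low SCF readout only
    Sample(0.25, 0.8, False),  # low IL-27 readout only
    Sample(0.08, 1.0, False),  # low on both: misclassified by every rule
]
```

In this toy cohort each single rule keeps sensitivity at 1.0 with specificity 0.5, and requiring both rules raises specificity to 0.75 without losing any CR, which mirrors the direction of the effect reported for Fig. 3C.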
Nodes advanced for additional training into study 2
Nodes from the first training study were included in the second training study if they met at least one of the following criteria: (a) were stratifying (AUCROC ≥ 0.66; P ≤ 0.05) or (b) exhibited good reproducibility between replicate assays (R ≥ 0.8). Based on these criteria, 87 of 147 nodes from study 1 were advanced into the second study. Three additional nodes for which a new assay had been developed after completion of study 1 were included in the second study: ara-C/daunorubicin–mediated apoptosis and expression of the MDR1 and MRP1 drug transporters. Each node was assessed for multiple metrics (e.g., basal, fold, and total), leading to a total of 182 node/metrics.
Study 2
This study evaluated the reduced set of node/metrics (n = 182) on BMMC samples (study 1 used peripheral blood as source of blast cells) collected at an independent center, from a larger patient pool, with different clinical and demographic patient characteristics that were more representative of the overall AML patient population compared with the first study sample set.
Patient and sample characteristics
Eighty-eight of the 134 cryopreserved AML BMMC samples in the study were evaluable after thawing. In contrast to the patient characteristics for study 1, the patient characteristics in study 2 were representative of the U.S. AML patient population and response rates except for the age distribution (Table 1). As expected, age, cytogenetic group, and secondary AML were statistically associated with clinical response to induction therapy. Due to a nonuniform sample falloff, the distribution of clinical characteristics varied across individual node/metrics.
Association between node/metric and clinical response—univariate
Univariate analysis, unadjusted for multiple testing, was done. All 182 node/metrics were tested for their ability to classify patients by clinical response to induction therapy. A total of 17 node/metrics were stratifying (AUCROC ≥ 0.66; P ≤ 0.05 on either Student's t test or Wilcoxon test; Supplementary Table S6). This number of nodes was lower than expected based on the results from the first training study, but higher than expected by chance (Supplementary Fig. S2B). Notably, 10 nodes overlapped with study 1 and represented the same three broad groups of biology interrogated, indicating that the CCG signaling pathways, phosphatase activity, and apoptosis pathways were important in predicting response to induction chemotherapy. We hypothesized that the lower number of classifying node/metrics observed in this patient sample set compared with study 1 was a consequence of differences in demographic and baseline clinical characteristics between the two studies (Table 1). To understand the potential differences between CR and NR donors within clinical subgroups, additional analysis was done by incorporating clinical covariates.
Nodes associated with clinical response in patient subsets as defined by clinical covariates
Age, performance status, diagnosis of secondary AML, and cytogenetic analysis determined at diagnosis are generally recognized as the most valuable prognostic factors in AML (7, 51). Therefore, as expected, these parameters were associated with response to induction therapy in our sample set (Table 1).
Age as a covariate
Age was incorporated into the analysis in two ways. First, it was used as a dichotomous variable. Analysis of the older patient cohort samples (≥60 years) revealed unique node/metrics that classified patients for response to induction therapy (Table 2B; Supplementary Table S7A). These included FLT3L→p-Akt ∣ Fold (0.85) and IL-27→p-STAT3 ∣ Fold (0.83). Thirteen node/metrics were found to stratify patients for response to induction therapy in the younger patient group (<60 years; Table 2B; Supplementary Table S7B). Notably, eight of these nodes were also found in study 1 (Table 2; Supplementary Table S7), including IFNα→p-STAT1 ∣ Fold (0.75 versus 0.75), IL-27→p-STAT3 ∣ Total (0.9 versus 0.83), and etoposide→(p-Chk2−,c-PARP+) ∣ Quad (0.81 versus 0.72). As found in study 1, where most patients were <60 years of age, classifying nodes were from the CCG biological category, including the JAK/STAT and CREB signaling pathways. Intact apoptotic machinery was again found to predict response to induction chemotherapy in this group. Second, age was combined, as a clinical variable, with certain node/metrics (e.g., IL-27→p-STAT3 ∣ Fold), which gave a higher predictive value than either age or the node/metric alone (Fig. 4). Importantly, these data show the ability of SCNP to identify proteomic profiles that improve on age as a clinical prognostic indicator for clinical response.
Presence or absence of secondary AML
Univariate analysis of secondary AML revealed stratifying nodes from pathways overlapping with those found in the older population (Supplementary Tables S7 and S8), suggesting that in this sample set, age at diagnosis might be considered as a surrogate marker for a different disease biology. In contrast, no correlation between age and response to therapy was found when age was examined as a variable across the secondary AML sample subset. This finding suggests that the underlying biology of secondary AML is different from that of de novo AML and that age is not prognostic for response in the secondary AML patient subset.
Cytogenetics
All patient samples with a favorable cytogenetic grouping had a CR to induction chemotherapy in this study. Incorporation of the cytogenetic group as a covariate for patients with intermediate and high-risk cytogenetics revealed several node/metrics (n = 14) that significantly added to the predictive value of the cytogenetic group itself and overlapped with nodes observed in other univariate analyses. These included members of the CCG group [e.g., IL-6→p-STAT5 ∣ Fold (0.98) and IL-27→p-STAT3 ∣ Fold (0.81)] and the apoptosis group [e.g., ara-C/daunorubicin→(p-Chk2−,c-PARP−) ∣ Quad (0.74)].
FLT3 mutational status
As expected, FLT3R mutational status was not predictive of response to induction therapy in this data set (P values in Table 1).
Discussion
The two studies reported here show that characterization of intracellular pathway biology in AML from individual patients using modulated SCNP can be performed with high technical accuracy and reproducibility. Furthermore, this characterization was associated with response to AML induction therapy and distinct from other known prognostic factors (such as age, secondary AML, and cytogenetics).
The data presented are from two independent, sequentially tested cryopreserved AML sample sets obtained from the leukemic cell banks of two distinct cancer centers: PMH/UHN and MDACC. The sets differ substantially in sample number, source of leukemic cells, and patient clinical characteristics. In the first study, PBMCs were collected from a relatively homogeneous population of predominantly female patients <60 years, enriched for patients whose disease did not respond to standard induction chemotherapy. By contrast, the second training study included 88 evaluable BMMC AML samples obtained from a more heterogeneous group of patients with the expected (for age) rate of response to cytarabine-based induction therapy. In both sets, there were few samples from older patients responsive to induction chemotherapy, thus limiting the strength of observations for this patient subset. However, the second sample set was more representative of the general AML population, so additional analysis by clinical characteristics was possible. Previous studies have shown that protein levels in AML cells do not seem to exhibit biologically relevant differences between specimen sources (13) and clinical outcome seems to be independent of cytarabine dose (100 mg/m2 to 3 g/m2; ref. 52). As such, interpretation of the SCNP analyses was hypothesized to be independent of source of leukemic blasts and cytarabine dose.
Despite the above study limitations, important observations can be made. First, the SCNP assay shows the level of reproducibility needed for clinical application. Second, univariate analysis in the first study identified 58 of 304 node/metrics as statistically significant (AUCROC ≥ 0.66; P ≤ 0.05) and associated with clinical response to induction therapy. These node/metrics included G-CSF–induced p-STAT3 and p-STAT5, previously shown to be potentiated in AML (4), and, reported here for the first time, IL-27–, IL-10–, and IL-6–mediated p-STAT1, p-STAT3, and p-STAT5. Importantly, apoptosis activated by both etoposide and ara-C/daunorubicin was shown to stratify patients by clinical outcome in both studies (Table 2). Third, the limitations of the first study in terms of sample size and skewed baseline disease characteristics prompted a second study performed with a larger sample set, more representative of the general AML population (and therefore more heterogeneous in terms of baseline disease characteristics). Analysis of the data suggests that differences in baseline characteristics of donors in the two studies played a significant role in the differences observed in the stratifying node/metrics. However, similar trends for some of the stratifying node/metrics (such as IL-27–mediated p-STAT1 and p-STAT3 signaling and etoposide-mediated cleaved PARP) were observed when clinically similar subsets of patients, although small, were compared. Another important observation that emerged from this second study was the ability of SCNP assays to reveal different pathways that correlated with patient outcome within patient subgroups defined by clinical prognostic characteristics such as age, cytogenetics, and presence or absence of secondary leukemia (Supplementary Tables S7 and S8).
Specifically, in patients younger than 60 years, intact communication between a DNA damage response and the apoptotic machinery after in vitro exposure to chemotherapeutic agents emerged as an important biological characteristic that identified CR samples. By contrast, for patients >60 years or with secondary AML, lack of response to induction chemotherapy was associated with increased FLT3L-induced p-Akt and p-ERK. Importantly, combining age with some predictive node/metrics (such as IL-27–mediated p-STAT1 or p-STAT3) increased the AUCROC from 0.65 for age alone to 0.89 and 0.87, respectively (Fig. 4). This showed that SCNP assays add important and independent information that distinguishes AML disease biology beyond age. Finally, although univariate signaling node/metrics predicted response to induction therapy, the combination of independently predictive node/metrics resulted in improved classifier performance. Future studies will continue this development work with a more extensive multivariate modeling exercise using a variety of techniques including logistic regression and decision trees (random forests) in combination with bootstrapping to lock down a robust classifier predictive of response to induction therapy to be validated on an independent patient sample set. Additional studies are in progress to compare SCNP analyses conducted on cryopreserved versus fresh, fractionated AML samples. The latter conforms to current clinical practice in which timely generation of information from a diagnostic AML sample is necessary for immediate disease management decisions.
In summary, the data show that AML characterization using SCNP can be performed with high technical accuracy and reproducibility to quantitatively characterize the biology of AML in individual patient samples. The results emphasize the value of a comprehensive evaluation of biologically relevant intracellular signaling pathways in AML blasts using SCNP as the basis for the development of highly predictive tests for response to therapy. Furthermore, these proteomic profiles were predictive of disease outcome in response to specific therapeutic interventions and distinct from other known prognostic factors such as age, secondary AML, and cytogenetics.
Ultimately, prospective studies with fresh samples collected from well-designed therapeutic studies in both the ≥60-year and <60-year patient age groups will be required to show the clinical utility of this approach. Working with fresh samples would reduce the cell viability issues that follow freeze/thaw and could lead to real-time predictive tools for patients, detailing which pathways are most perturbed and thus guiding therapies with inhibitors of the specific signal transduction pathways.
Disclosure of Potential Conflicts of Interest
S.M. Kornblau: consultant, Nodality, Inc.
Acknowledgments
We thank all patients who have donated samples for this investigation, Blossom Marimpietri (Nodality) for technical assistance with tables and figures, all Nodality staff for overall contribution to SCNP technology advancements, and Garry P. Nolan and David R. Parkinson for critical review of the manuscript. Special recognition is given to the late Helen Francis-Lang for sample procurement and selection.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.