Abstract
Pediatric acute lymphoblastic leukemia (ALL) is the most common pediatric malignancy, and the second leading cause of pediatric cancer–related death in developed countries. While the cure rate for newly diagnosed ALL is excellent, the genetic heterogeneity and chemoresistance of leukemia cells at relapse make individualized curative treatment plans difficult. We hypothesize that genetic events would coalesce into a finite number of protein signatures that could guide the design of individualized therapy. Custom reverse-phase protein arrays were produced from pediatric ALL (n = 73) and normal CD34+ (n = 10) samples with 194 validated antibodies. Proteins were allocated into 31 protein functional groups (PFG) to analyze them in the context of other proteins, based on known associations from the literature. The optimal number of protein clusters was determined for each PFG. Protein networks showed distinct transition states, revealing “normal-like” and “leukemia-specific” protein patterns. Block clustering identified strong correlation between various protein clusters that formed 10 protein constellations. Patients that expressed similar recurrent combinations of constellations comprised 7 distinct signatures, correlating with risk stratification, cytogenetics, and laboratory features. Most constellations and signatures were specific for T-cell ALL or pre-B-cell ALL; however, some constellations showed significant overlap. Several signatures were associated with Hispanic ethnicity, suggesting that ethnic pathophysiologic differences likely exist. In addition, some constellations were enriched for “normal-like” protein clusters, whereas others had exclusively “leukemia-specific” patterns.
Implications: Recognition of proteins that have universally altered expression, together with proteins that are specific for a given signature, suggests targets for directed combinatorial inhibition or replacement to enable personalized therapy. Mol Cancer Res; 16(8); 1263–74. ©2018 AACR.
See related article by Hoff et al., p. 1275
Introduction
Pediatric acute lymphoblastic leukemia (ALL) is the most common form of cancer in children, accounting for approximately 25% of all childhood malignancies. Despite dramatic improvements in outcome over the past few decades, with 5-year survival rates approaching 90% (1, 2), relapsed ALL remains one of the leading causes of pediatric cancer mortality and morbidity. To improve therapeutic outcome in high-risk patients and relapsed ALL, we need an improved understanding of individual molecular pathophysiology. Defining which signaling pathways and regulatory network dependencies are crucial to driving the underlying malignancy would facilitate the use of targeted therapies on an individualized basis.
High-throughput next-generation sequencing has led to an advanced understanding of the genetic heterogeneity of pediatric ALL; this in turn has led to a focus on novel therapies that target frequently mutated candidate genes (3). This research has revealed multiple recurrent genetic alterations affecting genes involved in lymphoid development, cell-cycle regulation, tumor suppression, apoptosis, lymphoid signaling, and transcriptional regulation (3, 4). However, with the exception of the BCR-ABL+ tyrosine kinase inhibitors (3), most recurrent genetic events identified to date lack therapeutic agents that specifically target the mutated proteins resulting from these genetic mutations. Furthermore, those genetic and epigenetic changes occur in a near-infinite number of combinations, and the physiologic consequences of combinatorial genetic mutations are largely undefined. This genetic heterogeneity makes personalized rational treatment combinations challenging.
As the molecular consequences of genetic and epigenetic events are predominantly mediated by the altered expression and function of proteins, we hypothesize that genetic heterogeneity coalesces into a more finite number of protein expression patterns, and that these protein expression patterns reveal key protein dependencies that could identify therapeutic targets. Gene expression profiling (GEP) has revealed recurrent patterns of gene expression, but has the limitation that messenger RNA transcript expression correlates with protein abundance for less than 50% of genes (5–11). GEP also does not reflect posttranslational modifications (PTM) and protein activation states. As proteins function in networks and functionally related pathways, rather than on an individual basis, we further hypothesize that analyzing proteins using a network-based approach should identify crucial recurrent protein expression patterns that define subpopulations of pediatric ALL. We therefore set out to define unique protein expression patterns across pediatric ALL patients with the goal of informing risk classification and suggesting novel combinational therapy.
Methods
Patient population
Peripheral blood (PB) mononuclear cells were collected from 73 ALL patients (67 newly diagnosed and 6 relapsed pediatric ALL) that were evaluated at the Texas Children's Hospital (TXCH) between July 2010 and June 2015. Samples were collected prior to induction therapy and in accordance with Institutional Review Board (IRB)-approved policies. Informed written consent was obtained in accordance with the Declaration of Helsinki, and applicable local and state laws. Demographics are described in Table 1. Sixteen patients were diagnosed with T-cell ALL and 57 with pre-B cell ALL. A high percentage were of Hispanic ethnicity (N = 45/73, 62%). Single-nucleotide polymorphisms (SNP) were determined for 54 patients to verify their genetic ancestry. Patients were stratified into risk groups according to the Children's Oncology Group (COG; ref. 12) and were treated under a variety of COG protocols (Supplementary Table S1). All but six patients achieved complete remission (CR), and only four relapsed. Sixteen patients underwent stem cell transplantation and 63 (86%) were alive at the end of follow-up (28 to 350 weeks). Mutation analysis was restricted to that performed as part of routine clinical care and included analysis of Mll, Cdkn2a, Igh, Tcf3, Etv6, and Runx1. This mutation information was available for all but two patients.
Table 1. Patient characteristics of the 73 pediatric ALL patients
Characteristics | N (%) |
---|---|
Number of cases | 73 |
ALL subtype | |
Pre-B ALL | 57 (78) |
T-ALL | 16 (22) |
Age, y, median (range) | 7.3 (0.2–18.0) |
Gender | |
Male | 40 (55) |
Female | 33 (45) |
Declared Ethnicity | |
Caucasian | 61 (84) |
Hispanic | 44 (72) |
Non-Hispanic | 17 (28) |
Black American | 6 (8) |
Asian | 4 (5) |
Mixed | 2 (3) |
Other | 1 (1) |
"SNP" Ethnicity | |
European | 9 (12) |
African | 6 (8) |
Hispanic/American Indian | 38 (52) |
Asian | 1 (1) |
Not done | 19 (26) |
Cytogenetics | |
Favorable | 15 (21) |
Intermediate | 42 (58) |
Unfavorable | 15 (21) |
Unknown | 1 (1) |
Risk group | |
Low risk | 4 (5) |
Standard/intermediate risk | 29 (40) |
High/very high risk | 40 (55) |
CNS status | |
CNS-1 | 46 (63) |
CNS-2 | 20 (27) |
CNS-3 | 6 (8) |
Unknown | 2 (3) |
Response | |
Complete remission | 67 (92) |
Resistant | 4 (5) |
Fail | 2 (3) |
Alive | 63 (86) |
NOTE: Cytogenetic aberrations were classified into favorable, intermediate, and unfavorable cytogenetics. Favorable: hyperdiploid, diploid, and t(12;21) ETV6/RUNX1 translocation; unfavorable: 11q23 rearrangement, hypodiploid, t(9;22) BCR/ABL1 translocation, 5q deletion. Patients that were not classified as favorable or unfavorable were defined as having intermediate cytogenetics. Central nervous system (CNS) involvement was categorized into three groups according to the COG standard. CNS-1: no blasts in the cerebrospinal fluid (CSF), CNS-2: <5% blasts in the CSF with or without red blood cells, CNS-3: >5% blasts in the CSF. Risk group stratification was done according to the AALL protocols.
RPPA methodology
The antibody-based high-throughput reverse-phase protein array (RPPA) methodology was performed on 73 samples from pediatric patients with ALL, 10 cryopreserved CD34+ normal bone marrow samples (AllCells), and 95 leukemic cell line samples. Cell lines were obtained from the ATCC and different laboratories and were tested for mycoplasma using the mycoplasma PCR detection kit (Applied Biological Materials Inc. catalog no. G238). Patient samples were processed into RPPA lysates on the day of collection and no samples were prepared from cryopreserved samples. The methodology and validation of the technique are fully described in previous publications (13–15). Briefly, the whole-cell lysate protein preparations were made from the mononuclear cell fraction of ficolled peripheral blood and normalized to a concentration of 1 × 10⁴ cells/μL. Samples were printed in five (1:2) serial dilutions onto slides along with normalization and expression controls. Slides were probed with 194 strictly validated primary antibodies and a secondary antibody to amplify the signal, and finally a stable dye to precipitate protein signal (16). This included antibodies against 149 different proteins along with 36 antibodies targeting phosphorylation sites, six targeting cleaved forms of caspase, NOTCH1, and PARP1, and three targeting Histone 3 methylation sites. A “Rosetta Stone” table of manufacturer, antibody name, and primary and secondary antibody dilutions can be found in Supplementary Table S2. The stained slides were analyzed using Microvigene software (Vigene Tech) to produce quantified data.
Nomenclature protein and antibody names
As neither the HUGO (17), HUPO (18), nor MiMI (19) naming systems account for PTM, we used a nomenclature in which the HUGO gene symbol is followed by a period, then the type of PTM (“p” for phosphorylated, “cl” for cleaved, or “Me” for methylation), followed by the letter code for the affected amino acid and its sequence position. For example, AKT1.pT308 is AKT1 phosphorylated on threonine at position 308. Placing the PTM after the protein name enables alphabetical sorting and inclusion of the affected site.
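The naming convention above can be illustrated with a small parser. This is a minimal sketch, not part of the published pipeline: the regular expression, the `AntibodyName` type, and the function name are assumptions made for illustration, and names that do not follow the symbol.PTM-residue-position form are rejected rather than interpreted.

```python
import re
from typing import NamedTuple, Optional

# PTM type codes named in the text: "p" phosphorylated, "cl" cleaved, "Me" methylation.
PTM_TYPES = {"p": "phosphorylated", "cl": "cleaved", "Me": "methylated"}


class AntibodyName(NamedTuple):
    gene: str                # HUGO gene symbol, e.g. "AKT1"
    ptm: Optional[str]       # PTM description, or None for the unmodified protein
    residue: Optional[str]   # one-letter amino acid code, e.g. "T"
    position: Optional[int]  # sequence position, e.g. 308


# Hypothetical pattern: symbol, then optionally ".", PTM code, residue letter, position.
_PATTERN = re.compile(r"^([A-Za-z0-9]+)(?:\.(p|cl|Me)([A-Z])(\d+))?$")


def parse_antibody_name(name: str) -> AntibodyName:
    """Split a name such as 'AKT1.pT308' into gene symbol and PTM fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized antibody name: {name!r}")
    gene, code, residue, position = match.groups()
    if code is None:
        return AntibodyName(gene, None, None, None)
    return AntibodyName(gene, PTM_TYPES[code], residue, int(position))
```

Because the gene symbol comes first, a plain lexicographic sort keeps every form of a protein next to its unmodified entry, which is the sorting property the nomenclature is designed to provide.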
Data normalization and processing
SuperCurve algorithms were used to generate a single value from the five serial dilutions (20). Loading control (21) and topographical normalization (22) procedures were performed to account for protein concentration and background staining variations. As all samples had replicates, the average expression level of the replicates was used as a single expression level. All protein expression levels were shifted relative to the median of the normal CD34+ bone marrow samples.
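The last two steps of this paragraph (averaging replicates, then shifting relative to the normal CD34+ median) can be sketched as follows. This is an illustrative sketch only: it assumes expression values are on a log-like scale so that "shifting" means subtraction, it handles a single protein at a time, and the function and sample names are hypothetical. SuperCurve fitting, loading control, and topographical normalization are not reproduced here.

```python
from statistics import mean, median


def average_replicates(replicates):
    """Collapse replicate measurements {sample: [values, ...]} to {sample: mean}."""
    return {sample: mean(values) for sample, values in replicates.items()}


def shift_to_normal_median(expression, normal_samples):
    """Subtract the median of the normal samples from every sample's value.

    After the shift, 0 corresponds to the median of the normal CD34+ samples,
    positive values are above normal, and negative values are below normal.
    """
    reference = median(expression[s] for s in normal_samples)
    return {sample: value - reference for sample, value in expression.items()}
```

With this convention, a value of zero for a patient sample means expression equal to the normal CD34+ median for that protein.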
Computational analysis
The computational analysis was done using the “meta-Galaxy” analysis (Supplementary Fig. S1), because we had previously seen in adult acute myeloid leukemia (AML) that this approach, which analyzes proteins in the context of functionally related proteins, obtained more clinically interesting patient groups compared with the traditional approaches (unpublished data). In contrast to the traditional unsupervised hierarchical clustering that ignores all the known relationships between proteins, and has the additional disadvantage of weighing each component equally, we first divided the 194 proteins into 31 functionally related protein groups, each defined as a “Protein Functional Group” (PFG). This allocation into functionally related groups was based on known function or pathway membership from the existing literature, or on strong associations within the dataset (e.g., BRCA2 to the “Cell Cycle” PFG and DDX17 to the “Ribosome” PFG). Because proteins could have multiple functions, proteins could belong to multiple PFG. The proteins involved in each PFG are listed in Supplementary Table S3.
To identify whether subsets of cases with similar (correlated) expression of core member proteins existed within each PFG, Progeny Clustering (23), a bootstrapping- and stability-based method for selecting the number of clusters, was used in combination with k-means (24) to generate the protein clusters. Subsets of patients were identified on the basis of their relative protein expression similarities (i.e., Euclidean distance), and each subset was defined as a “protein cluster.” The optimal number of protein clusters for each PFG was determined using the clustering solution stability scores. For some PFGs, an alternative number of clusters was chosen or small clusters were merged into the closest group to make more biologically relevant clusters. Linear discriminant analysis was performed to determine which of the protein clusters was statistically most similar to the normal CD34+ samples (25). This protein cluster was then set as cluster 1 and was positioned to the far left. Principal component analysis (PCA) (26) was used to visualize the distribution of the protein clusters relative to that of normal CD34+ bone marrow samples. Associations between protein clusters and clinical/laboratory features were assessed using the Fisher exact test for categorical variables and the Kruskal–Wallis test by ranks for continuous variables. Survival curves were generated using the Kaplan–Meier method. Protein networks were constructed from known protein associations that were obtained from the STRING literature database (combined score > 0.9; ref. 27) in combination with computationally reconstructed interactions from the RPPA data using graphical lasso (28) and StARS (ref. 29; for model selection based on stability). As the STRING database does not consider PTMs, the protein names were used to query literature-based interactions for PTM sites.
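The k-means step, which groups patients by Euclidean distance between their expression profiles within one PFG, can be sketched with plain Lloyd iterations. This is a minimal illustration under stated assumptions: it is not Progeny Clustering, it does not select the number of clusters or compute stability scores, and the starting centroids are supplied by the caller rather than chosen by the published method.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with Euclidean distance.

    points: list of equal-length numeric tuples (one per patient).
    centroids: list of starting centroids (one per cluster).
    Returns (labels, centroids), where labels[i] is the cluster index of points[i].
    """
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each patient goes to the nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids
```

In a stability-based scheme, a routine like this would be rerun on resampled data for each candidate number of clusters, and the number giving the most reproducible co-assignment of patients would be retained.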
Next, we rebuilt the overall picture by combining the individual protein clusters into one binary matrix to assess whether we could identify patterns of protein clusters from various PFG that recurrently co-occurred. This matrix indicated the protein cluster membership for each patient in all PFG: 1 if a patient was a member of a protein cluster, 0 if not. Block clustering (30) was performed to search for strong recurrent correlations between protein clusters from various PFG; each such group of correlated protein clusters was defined as a “protein constellation.” A group of patients that expressed similar patterns of protein constellations was defined as a “protein signature.” The optimal number of protein constellations that formed protein signatures was obtained by selecting the combination that generated the largest sum of the squared difference between the expected and observed values, divided by the expected value. The expected value was defined as the product of cluster membership within that constellation, divided by the frequency of patients that fell within a given signature. Correlations between signatures and clinical features/outcomes were assessed in the same way as for the individual PFG. Lists of proteins that were significantly over- or underexpressed relative to the normal CD34+ cells were generated for each constellation and for each signature using the Wilcoxon signed-rank test with a false discovery rate–adjusted P value (P < 0.01). The most discriminative proteins that allow classification into the signatures were selected using Random Forest (31). All the statistical tests and plots were generated in R (Version 0.99.484, 2009–2015 RStudio, Inc.). Networks were generated in Cytoscape (Version 3.3.0; ref. 32).
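The binary membership matrix described above can be sketched as follows. The function name and the toy cluster counts are hypothetical, and the sketch stores patients as rows, whereas the published figure shows patients as columns. Block clustering itself is not implemented here.

```python
def membership_matrix(assignments, clusters_per_pfg):
    """Build a binary patient-by-cluster membership matrix.

    assignments: {patient: {pfg: cluster_number}} with 1-based cluster numbers.
    clusters_per_pfg: {pfg: number_of_clusters}.
    Returns (columns, rows): columns is a list of (pfg, cluster_number) labels,
    rows maps each patient to a list of 0/1 values in column order.
    """
    columns = [
        (pfg, cluster)
        for pfg, n_clusters in clusters_per_pfg.items()
        for cluster in range(1, n_clusters + 1)
    ]
    rows = {
        patient: [1 if per_pfg[pfg] == cluster else 0 for pfg, cluster in columns]
        for patient, per_pfg in assignments.items()
    }
    return columns, rows
```

Each patient belongs to exactly one cluster per PFG, so every patient's row sums to the number of PFG; this row total is a quick sanity check on the matrix.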
Results
Existence of “normal-like” and “leukemia-specific” protein patterns
To characterize heterogeneity in protein expression between pediatric ALL patients, we started our analysis by evaluating proteins in the context of their own PFG. We applied the Progeny Clustering algorithm to identify the optimal number of “protein clusters” for each PFG, where a protein cluster is a subset of cases with similar (correlated) expression of the core protein components of that PFG. The number of protein clusters ranged from 3 to 5 for each PFG (Fig. 1A). The measure of cluster stability was based on the co-occurrence probability matrix (Supplementary Fig. S2). Overall, clusters showed high stability and reproducibility with scores of 0.6–0.9. Neither the processing time of the samples nor the date of collection was found to confound the clustering analysis (Supplementary Fig. S3). Next, PCA was performed to determine whether the protein clusters were similar to the normal CD34+ samples, or if they were sufficiently dissimilar to be specific to a leukemic state, based on their graphic distribution on the PCA plot in comparison with the normal CD34+ samples. Most PFG (N = 23) had at least one cluster with expression similar to the normal CD34+ samples (Fig. 1A, checkered pattern). In contrast, leukemia-specific clusters, lacking overlap with the normal CD34+ cells, were observed for all 31 PFG (Fig. 1A, solid fill), with 8 PFGs (cell cycle, differentiation, MEK, PKC, STP upstream, T-cell, transcription, and WNT signaling) having only leukemia-specific clusters. For 8 of the PFGs, we could identify more than 1 “normal-like” pattern.
The optimal number of protein clusters for all protein functional groups. A, The optimal number of protein clusters that was identified for each of the 31 PFG is illustrated. Protein patterns that showed sufficient overlap with the normal CD34+ samples on the PCA plot were assigned as “normal-like” protein clusters and are shown as checkered boxes. Protein clusters that were sufficiently dissimilar from the normal CD34+ were assigned as “leukemia-specific” and are shown as solid boxes. B, Representation of pediatric ALL protein clusters that were mimicked by at least one of the leukemic cell lines. Green ticks indicate that a protein cluster had a cell line with a protein expression pattern equivalent. The red crosses indicate that none of the cell lines were found to express a comparable protein expression pattern.
Protein functional groups reveal different protein activity states
To visualize interactions between core member proteins of each PFG and other probed proteins in the dataset, protein networks were generated. Networks were built by integrating previously known protein interactions from the literature and strong correlations in the dataset. The median expression for each protein cluster was then calculated relative to the normal CD34+ cells and overlaid onto the networks to reveal the overall differences in expression and activation associated with each protein cluster. For instance, for the “Friend Leukemia Virus Integration 1” (FLI1) that was formed by 5 core protein members (FLI1, NCL, NPM1, STMN1, and WTAP), we were able to recognize 5 distinct protein clusters (C1, C2, C3, C4 and C5; Fig. 2A). By convention, the protein expression levels in C1 were statistically the closest to normal and showed most proteins with expression similar to the normal CD34+ samples (Fig. 2B). The greatest variation between the protein clusters was observed in the expression of key protein STMN1; progressively higher expression in C2 and C4 and increasingly lower expression in C3 and C5 (Fig. 2C). Another example that showed the concept of different transition states was the “Apoptosis Occurring” PFG. Here we could recognize 3 different protein clusters (C1, C2, and C3; Figures available online at http://qutublab.rice.edu/pediatric-all/ApopOccur/). Increased evidence of apoptosis activation, in the form of cleavage of Parp1, caspase-3, and caspase-7 was evident in C2 and C3, representing two apoptosis “on” states.
Relative protein expression levels for the proteins involved in “Friend Leukemia Virus Integration 1” (FLI1) protein functional group. A, This heatmap shows the relative protein expression levels for the 5 core member proteins of the “FLI1” PFG: STMN1, FLI1, NPM1.3542, WTAP and NCL. The Progeny Clustering algorithm (coupled with k-means) was performed and identified an optimal number of 5 protein clusters (C1, C2, C3, C4, C5). The colors reflect the median expression levels relative to the normal CD34+ samples. Proteins expressed greatly below normal are shown as dark blue, and proteins expressed significantly above normal are shown in dark red (maroon). Proteins within the range of the normal cells are colorized in green (extended up to yellow and down to aqua). Each column represents a single patient. The annotation bar shows patient membership for the different ALL subtypes [pre-B cell (yellow) and T-cell (magenta)] and for the 5 defined protein clusters [C1 (red), C2 (magenta), C3 (yellow), C4 (light green) and C5 (dark green)]. B, Principal component analysis (PCA) visualized the global distribution of the patients in their assigned protein cluster relative to the normal CD34+ samples. From the PCA, partial similarity between normal CD34+ cells [black] and C1 [red] was observed, while the leukemia specificity of C2 [magenta], C3 [yellow], C4 [light green] and C5 [dark green] was demonstrated by the lack of overlap with the normal CD34+ cells. Each plotted dot represents one patient. C, Protein networks show interactions between the 5 core protein members (large nodes) and associated proteins (small nodes). Colors reflect the relative median protein expression within that protein cluster, ranging from high (maroon) to low (dark blue). Dotted lines (···) indicate known associations from the literature, dashed lines (- - -) indicate interactions based on strong correlation in the dataset, and solid lines (──) indicate interactions seen both in the literature and in our dataset. Arrows show transition from the most normal state C1 to the more “on” states C2 and C4 and the more “off” states C3 and C5 relative to C1.
Protein constellations express recurrent patterns of protein expression
Although traditional approaches that cluster patients directly by taking all proteins together with unsupervised hierarchical clustering could clearly separate pediatric ALL patients from the healthy subjects (Supplementary Fig. S4), we hypothesized that we would find more robust patterning within the pediatric ALL samples if we created smaller subsets of proteins based on known functional relationships, and then built up the overall picture from these individual building blocks (Supplementary Fig. S5). Therefore, we developed a novel approach that defined patient signatures by looking for recurrences in expression patterns within PFG and from there built higher order structures by performing hierarchical clustering of those smaller patterns.
As described, protein clusters were first defined within each PFG, which resulted in a total of 114 protein clusters for the 31 PFG. As each patient was represented by one of the protein clusters of each individual PFG, each patient was a member of 31 out of the 114 protein clusters. Second, all protein clusters were compiled into a single binary matrix, which we called a “meta-Galaxy” (Fig. 3A). Block clustering was conducted to search for recurrent associations between various protein clusters, which were defined as a “protein constellation” (Fig. 3A, horizontal). Patients that showed recurrent patterns of protein constellations were together defined as “protein signature” (Fig. 3A, vertical). An optimization calculation was performed to determine the optimal number of constellations and signatures. This was determined by selecting the matrix where the squared sum of the difference between the observed and expected values of each combination of signatures and constellations, divided by the expected value, was maximal. This suggested the presence of 10 protein constellations and 7 protein signatures. For instance, constellation 4, which was horizontally formed by 16 protein clusters, was strongly associated with patients that formed signature 7. The expected occurrence in this constellation was 9% for each signature, based on the presence of 107 single blue points that indicated protein cluster membership out of 1,168 (constellation 4; 107 single blue points vs. a potential of 16 × 73 patients = 1168, 107/1168 = 0.09). Within this constellation, signature 7 showed an observed occurrence significantly above expectation of 94% (64 of the 68 points were blue, vs. an expected number of 0.09 × 68 = 6). In contrast, none of the patients in signature 2 had a membership for any of the protein clusters within this constellation (0/128 blue points; P < 0.001). 
Likewise, constellation 9, which was formed by 3 protein clusters, had an expected presence rate of 62% [135 blue points vs. a potential of 219 points (3 × 73 patients)]. Within this constellation, signature 1 had an observed presence rate above expected of 94% [34 vs. 36 (3 × 12) blue points] and signature 7 had an observed presence rate below expected of 0% [0 vs. a potential of 12 (3 × 4) points; P < 0.001]. A list of the protein clusters in each constellation is shown in Supplementary Table S4. An example of the optimization calculation is shown in Supplementary Fig. S6.
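The arithmetic behind these expected and observed presence rates can be reproduced in a few lines. This is a sketch using only counts quoted in the text; the function names are hypothetical, and `deviation_term` shows a single term of the (observed − expected)²/expected sum rather than the full optimization over all constellation and signature combinations.

```python
def presence_rate(blue_points, n_clusters, n_patients):
    """Fraction of filled cells in a block of n_clusters x n_patients."""
    return blue_points / (n_clusters * n_patients)


def deviation_term(observed, expected):
    """One term of the (observed - expected)^2 / expected sum."""
    return (observed - expected) ** 2 / expected


# Constellation 4: 16 protein clusters across all 73 patients, 107 blue points.
expected_c4 = presence_rate(107, 16, 73)   # 107 / 1168

# Constellation 9: 3 protein clusters across all 73 patients, 135 blue points.
expected_c9 = presence_rate(135, 3, 73)    # 135 / 219

# Constellation 9 within signature 1 (12 patients): 34 of 36 points are blue.
observed_c9_s1 = presence_rate(34, 3, 12)  # 34 / 36

# Expected number of blue points in that block, and its contribution to the sum.
expected_points_c9_s1 = expected_c9 * 3 * 12
term_c9_s1 = deviation_term(34, expected_points_c9_s1)
```

Blocks whose observed count is far from the expected count, in either direction, contribute large terms, so the combination of constellations and signatures that maximizes the sum is the one with the sharpest block structure.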
"Meta-Galaxy" analysis identifies strong correlation between protein clusters from various protein functional groups. A, Block clustering identified the existence of 10 protein constellations (horizontally) and 7 protein signatures (vertically). Each column represents a single patient and is positive (blue) for 31 out of the 114 protein clusters. Each row represents a single protein functional group cluster. The annotation bar shows a clear division in ALL type [pre-T cell (magenta), pre-B cell (yellow)], and shows the patient characteristics, including gender, age, treatment risk group, CNS status, cytogenetics, declared ethnicity, SNP-verified ethnicity, CDKN2A mutation status, relapse, achievement of complete remission, and the vital status of the patient. B, Block clustering limited to the T-cell ALL samples enabled recognition of 6 protein constellations (horizontally) and 3 protein signatures (vertically). The annotation bar shows patient characteristics for ethnicity and suggests ethnicity-associated constellations.
Most constellations were associated with a single ALL subtype, with constellations 3 and 5 only being found in T-cell ALL and constellations 2, 4, 6, 8, and 10 being exclusive to pre-B-cell ALL. However, constellations 1 and 9 showed some overlap between pre-B-cell ALL and T-cell ALL, suggesting shared protein deregulation. A clear distinction was observed between the T-cell–specific signature 1 (N = 12/12, 100%), the pre-B ALL dominant signatures 2, 3, and 4 (N = 19/23, 83%), and the pre-B ALL exclusive signatures 5, 6, and 7 (N = 38/38, 100%). Because the majority of T-cell ALL cases were within signature 1, we conducted a separate analysis of only T-cell ALL samples. As shown in Fig. 3B, we observed three T-cell signatures based on 6 constellations. A list of the protein clusters in each constellation is shown in Supplementary Table S5. A similar analysis was performed using only B-cell ALL cases, but this was not different from what was seen in signatures 2–7 (Supplementary Fig. S7; Supplementary Table S6).
Because we identified protein clusters that showed sufficient overlap with our healthy CD34+ cells to be defined as normal-like protein clusters, we next asked whether constellations were enriched or depleted for those clusters. We found constellations that showed enrichment for normal-like patterns (constellations 1, 8, 9, and 10) and constellations that had exclusively leukemia-specific patterns (constellations 2, 3, and 5; P = 0.011).
Protein patterns correlate with outcomes and clinical and laboratory features
Typical of pediatric ALL, this cohort was characterized by a high CR rate (N = 67, 93%) and a low therapy resistance rate (N = 4, 5%) that, in combination with a low relapse rate (N = 4, 5%), resulted in a high survival rate (N = 63, 86%). Given the paucity of events, signatures did not show a statistically significant correlation with overall survival (OS; Fig. 4A) or disease-free survival (DFS; Fig. 4B), where a DFS event was defined as relapse or death following an induction that led to CR. However, 3 of the 4 relapse cases were within signature 6. Univariate Cox proportional-hazard analysis showed no other relationships between the survival probability and any of the collected patient features (Supplementary Table S7).
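As a minimal sketch of how survival curves such as those in Fig. 4A and B are estimated, the following Python function implements the standard Kaplan-Meier product-limit estimator. The function name, variable names, and toy data are illustrative assumptions and are not taken from the study; the statistical software actually used is not specified in this section.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- follow-up time per patient (any consistent unit)
    events -- 1 if the event (e.g. relapse or death) was observed, 0 if censored
    """
    n_events = Counter(t for t, e in zip(times, events) if e)
    n_leaving = Counter(times)  # events and censorings both leave the risk set
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(n_leaving):
        d = n_events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving[t]
    return curve


# Hypothetical toy data, NOT patient data from this cohort
toy_times = [5, 8, 8, 12, 20, 20, 24, 30]
toy_events = [1, 0, 1, 0, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for t, s in toy_curve:
    print(f"t = {t:>2}  S(t) = {s:.4f}")
```

With as few events as in this cohort (4 relapses), each curve would have very few steps, which is consistent with the lack of statistically significant separation between signatures.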
Survival analysis for the identified ALL protein signatures and the protein summary plot for the constellations associated with the Hispanic ethnicity within the T-ALL patients. The Kaplan-Meier curves for overall survival (A) and disease-free survival (DFS) (B) were generated according to the 7 ALL protein signatures. Line colors match with the colored annotation bar on the “meta-Galaxy”. C, Proteins with significantly higher or significantly lower protein expression levels relative to normal CD34+ cells within T-ALL constellation 3 (enriched for Hispanic ethnicity) and constellation 5 (enriched for non-Hispanic ethnicity) are shown. Proteins in constellation 3 were predominantly involved in PFG “Cell Cycle”, “FLI1” and “IAP-Apoptosis” and proteins in constellation 5 were involved PFG “BH3 Apoptosis”, “Cell Cycle”, “FLI1”, “IAP-Apoptosis” and “MEK”. Colors reflect the relative median expression within that specific constellation, ranged from the lowest (dark blue) to relatively normal (cyan-green-yellow) to the highest (maroon) expression.
On the other hand, signatures were significantly associated with patient demographics and laboratory variables (Table 2). Favorable cytogenetics were overrepresented in signatures 5 and 7, and intermediate cytogenetics were overrepresented in signatures 1 and 4 (P = 0.017). No associations were seen for single cytogenetic abnormalities, such as the frequently harbored 11q23 rearrangement. This lack of association with specific cytogenetic types again highlights the large heterogeneity among ALL patients and is likely due to various combinations of mutations. As expected for the T-cell ALL signature, CDKN2A was highly mutated (N = 9/12, 75%) compared with the overall mutation rate (N = 20/64, 31%; P = 0.007; ref. 33). A low CDKN2A mutation frequency was observed for signature 2 (N = 1/7, 14%), signature 4 (N = 1/9, 11%), and signature 7 (N = 0/2, 0%).
Demographics and laboratory features for 7 identified protein expression signatures
Category | Type | Total: count or median | Total: freq. (%) | Signature 1 | Signature 2 | Signature 3 | Signature 4 | Signature 5 | Signature 6 | Signature 7 | P
---|---|---|---|---|---|---|---|---|---|---|---
Total (%) | | 73 | 100 | 12 | 27 | 19 | 11 | 16 | 8 | 5 |
ALL type (%) | Pre-B | 57 | 78 | 0 | 75 | 83 | 89 | 100 | 100 | 100 | <0.001
 | T | 16 | 22 | 100 | 25 | 17 | 11 | 0 | 0 | 0 |
White blood cell count (x k/μL) | Median | 30.5 | | 397 | 59 | 118 | 12 | 21 | 66 | 7 | <0.001
Peripheral blood absolute blast (k/μL) | Median | 25.1 | | 319 | 41 | 99 | 9 | 12 | 56 | 1 | <0.001
Peripheral blast (%) | Median | 73.9 | | 86 | 70 | 83 | 53 | 57 | 78 | 30 | 0.006
Lactate dehydrogenase (x U/L) | Median | 2,069 | | 9,914 | 4,702 | 3,609 | 2,049 | 1,195 | 1,500 | 965 | <0.001
Bilirubin (mg/dL) | Median | 0.3 | | 0.6 | 0.3 | 0.2 | 0.5 | 0.2 | 0.2 | 0.4 | 0.004
Fibrinogen (mg/dL) | Median | 332 | | 191 | 330 | 267 | 495 | 410 | 360 | 398 | 0.001
Human Leucocyte Antigen - antigen D-related (%) | Median | 100 | | 0 | 98.5 | 100 | 100 | 100 | 97.5 | 100 | <0.001
Cytogenetics (%) | Favorable | 15 | 21 | 0 | 14 | 0 | 11 | 50 | 15 | 75 | 0.003
 | Intermediate | 42 | 58 | 92 | 57 | 67 | 78 | 29 | 55 | 25 | 0.021
Risk group (%) | Intermediate | 12 | 16 | 67 | 13 | 17 | 0 | 7 | 0 | 25 | <0.001
CDKN2A mutation (%) | Yes | 20 | 21 | 75 | 14 | 25 | 11 | 33 | 22 | 0 | 0.025
NOTE: Significant patient characteristics and ALL features are shown for the overall patient cohort as well as for each protein expression signature. Other nonsignificant variables that were checked, but which lacked association with the protein expression signatures included: age at diagnosis, sex, race, CNS status, infection, IgH gene rearrangement, TCF3 gene rearrangement, ETV6 mutation, RUNX1 mutation, the percentage of bone marrow blasts, bone marrow and peripheral blood monocytes or promyelocytes, hemoglobin, platelet count, albumin and creatinine. P values were generated using the Kruskal–Wallis test by ranks for continuous variables and the Fisher exact test for categorical variables.
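To illustrate the kind of categorical comparison named in the table note, the following is a minimal pure-Python sketch of a two-sided Fisher exact test for a 2 × 2 table. It is applied to the CDKN2A counts quoted in the text (9 of 12 mutated in signature 1; 20 of 64 mutated overall), under the assumption that the 12 signature 1 patients are a subset of the 64 patients with known mutation status, so that the remainder is 11 of 52. The reported P values were presumably computed across all 7 signatures with the authors' own software, so this collapsed comparison is not expected to reproduce them; the function name is hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Signature 1: 9 mutated, 3 wild-type; all other signatures (assumed): 11 mutated, 41 wild-type
p_cdkn2a = fisher_exact_2x2(9, 3, 11, 41)
print(f"P = {p_cdkn2a:.2g}")
```

For the continuous variables in the table, the note names the Kruskal–Wallis test by ranks, which compares the rank distributions of the 7 signatures simultaneously rather than pairwise.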
Protein signatures are associated with Hispanic ethnicity
In numerous studies, Hispanic patients with pediatric acute leukemia have fared worse than Caucasians (34–37), but whether this arises from different underlying biology, or is related to socioeconomic factors is currently unknown. Our study population, drawn from an area in Southern Texas, was enriched for Hispanic patients. Pre-B-cell ALL signatures 3, 5, and 7 showed a similar proportion of Hispanic patients compared with the overall population (N = 45/73, 62%). However, both signature 2 (N = 6/8, 75%) and signature 6 (N = 17/20, 85%) were enriched for Hispanic patients, whereas Hispanic patients were underrepresented in signature 4 (N = 3/9, 33%; P = 0.021, df = 2). This imbalance in ethnic compositions was even stronger after verification by genetic ancestry mapping using SNP; two of the non-Hispanic patients in signature 2 were of African descent, also having inferior disease outcome (38). Two of the self-reported non-Hispanics in signature 6 were actually Hispanics by SNP and one patient stating Hispanic ancestry was Asian by SNP typing. This brings the total number of Hispanics in signature 6 to 18 (N = 18/20, 90%). Signature 4 had one additional Hispanic patient by SNP determination, making that signature less imbalanced (N = 4/9, 44%).
When T-cell ALL cases were considered separately, 3 signatures were present, with most of the Hispanic cases in signatures 1 and 2 (N = 6/8, 75%) and only two cases in signature 3 (N = 3/8, 38% following SNP analysis). Constellations 1, 2, 4, and 6 were unassociated with ethnicity, whereas constellation 3 was found in the Hispanic cases and constellation 5 was strongly present in the non-Hispanic cases. Notably, both constellations 3 and 5 contained protein clusters from the PFG “Cell Cycle,” “FLI1,” and “IAP-Apoptosis.” Expression summary plots are shown in Fig. 4C. The Hispanic-associated constellation 3 lacked the upregulation of CCND3, DUSP6, RB1, RB1.pS807_811, and STMN1 seen in the non-Hispanic constellation 5, but had upregulation of unphosphorylated FKHRL1 (FOXO3). Constellation 3 showed higher levels of antiapoptosis proteins, including XIAP and BIRC5, and lacked the suppressed expression of BCL2 and DIABLO.
Proteomics to predict potential protein leads for targeted therapy
Because most potential drugs target proteins, we generated lists of potential druggable targets for each signature and constellation (Supplementary Fig. S8; figures available online at http://qutublab.rice.edu/pediatric-all/global/). These potential target leads were identified as being significantly over- or underexpressed relative to normal CD34+ cells. Figure 5A shows an example of all proteins differentially expressed relative to the controls in signature 6, which comprised 3 of the 4 relapse patients. For example, PARP1 and cleaved PARP1 together with LEF1, PIK3CA, and BRAF were all highly expressed. From these lists, we could then identify proteins that were universally changed in the same direction in at least 6 of the 7 signatures (Fig. 5B). Hypothetically, rational combinations of targeted therapies directed against signature-specific proteins together with targeted therapies directed against universally altered proteins could be used in specific subsets of patients, alone or in addition to standard therapy, to overcome treatment resistance. However, this hypothesis needs validation in future experiments.
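The selection rule for universally altered proteins (changed in the same direction in at least 6 of the 7 signatures) can be sketched as follows. This is an illustration under stated assumptions: the input layout, the function name, and the placeholder protein labels "A" to "D" are hypothetical, and the upstream test for significance versus normal CD34+ cells is assumed to have been done already.

```python
from collections import Counter


def universally_altered(directions, min_signatures=6):
    """Return {protein: direction} for proteins altered the same way in
    at least `min_signatures` signatures.

    directions -- {signature: {protein: "up" | "down"}}, listing only the
                  proteins that differ significantly from normal CD34+ cells
    """
    tally = Counter()
    for calls in directions.values():
        for protein, direction in calls.items():
            tally[(protein, direction)] += 1
    return {
        protein: direction
        for (protein, direction), n in tally.items()
        if n >= min_signatures
    }


# Hypothetical toy input with placeholder protein labels (not study data)
toy = {sig: {} for sig in range(1, 8)}
for sig in toy:
    toy[sig]["A"] = "down"                        # down in all 7 signatures
    if sig != 3:
        toy[sig]["B"] = "up"                      # up in 6 of 7
    toy[sig]["C"] = "up" if sig <= 4 else "down"  # mixed direction
    if sig <= 5:
        toy[sig]["D"] = "up"                      # up in only 5 of 7

universal = universally_altered(toy)
print(universal)
```

Proteins that pass this filter correspond to the universal candidates of Fig. 5B, whereas proteins altered in only one or a few signatures are the signature-specific candidates.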
Significantly higher and lower expressed proteins relative to the normal CD34+ samples in signature 6. A, An example of all the significantly altered expressed proteins compared with the normal CD34+ cells for signature 6 is shown. Higher expressed proteins (up) suggest targets for inhibition and lower expressed proteins (down) suggest protein targets for replacement or activation. Blue circles denote proteins that were universally altered in similar direction in all signatures and red circles point out signature specific protein targets. Colors indicate the relative median protein expression for that signature, ranged from the lowest (dark blue) to the highest (maroon) expression. B, Proteins that were universally changed in the same direction (in ≥6 of the 7 signatures) compared with normal CD34+ samples are shown. Nonsignificantly different proteins compared with normal CD34+ samples are shown in white (blank).
Selection of discriminative proteins to aid in risk stratification and determine therapy
To classify patients into one of the 7 protein signatures based on a limited number of proteins, a Random Forest classifier was used to select the most discriminative proteins (Supplementary Fig. S9). This resulted in a correct classification rate of 78% (N = 57/73), whereby variation in protein expression enabled a higher than overall classification accuracy for signatures 1 (N = 12/12, 100%), 5 (N = 12/14, 86%), 6 (N = 17/20, 85%), and 7 (N = 4/4, 100%). For instance, patients in signature 1 could be separated on the basis of their relatively low CDKN1A in combination with their relatively high GATA1 and NOTCH3, and signature 7 could be discerned on the basis of low CASP3 levels. Inferior classification capability was found for signatures 2 (N = 4/8, 50%) and 3 (N = 1/6, 17%), which may be explained by the fact that neither signature 2 nor signature 3 was exclusively associated with any constellation and that none of the most discriminative proteins were significantly different compared with the other signatures.
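The per-signature accuracies quoted above can be checked against the overall rate with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the figure for signature 4 is not reported and is obtained here purely by subtraction from the overall total, so it should be read as an inference rather than a reported result.

```python
# (correctly classified, total) per signature, as quoted in the text
reported = {1: (12, 12), 2: (4, 8), 3: (1, 6), 5: (12, 14), 6: (17, 20), 7: (4, 4)}
overall_correct, overall_total = 57, 73

# Percentage accuracy per reported signature
rates = {sig: round(100 * c / n) for sig, (c, n) in reported.items()}

# Signature 4 is not reported; infer it from what is left over
correct_sum = sum(c for c, _ in reported.values())
total_sum = sum(n for _, n in reported.values())
inferred_sig4 = (overall_correct - correct_sum, overall_total - total_sum)

print(rates)
print("overall:", round(100 * overall_correct / overall_total), "%")
print("signature 4 (inferred): %d/%d" % inferred_sig4)
```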
Leukemic cell lines only partially mimic protein patterns
Leukemic cell lines are frequently used to investigate the pathobiology of leukemia, but immortalization and cryopreservation of those cells likely alter the biology of the cell from their leukemic patient cell of origin. To determine whether cell lines express differences or similarities in protein expression patterns compared with the pediatric ALL patient samples, we generated a new RPPA with 95 leukemic cell line samples, including cell lines derived from pediatric and adult ALL (e.g., Jurkat, REH), and AML patients (e.g., Kasumi-1, HL-60, Molm13, Molm14, OCIAML3). Arrays were probed with 235 antibodies of which 163 (N = 163/194, 84%) overlapped with the antibodies used on the pediatric patient array. Because the cell line and the pediatric acute leukemia patient array both had cells from healthy donors included, alignment of the control CD34+ samples from both enabled comparison of the arrays.
Overall, unsupervised hierarchical clustering and PCA clearly demonstrated completely distinct proteomic profiles for pediatric ALL patient samples and leukemic cell lines (Supplementary Fig. S10). Individual comparison of protein clusters showed that only 53 of the 114 (46%) protein clusters had at least one cell line equivalent (Fig. 1B; Supplementary Table S8). None of the constellations or signatures seen in the ALL patients was replicated in the cell lines.
Pediatric leukemia online portal
In addition to the PFG “FLI1” that is discussed in this article, results from all PFG analyses are published online and can be accessed at http://qutublab.rice.edu/pediatric-all/.
Discussion
Heterogeneity within the genetic and epigenetic landscape of pediatric ALL makes personalized medicine challenging. To assist in the process of both risk stratification and medication management, we have demonstrated that pediatric ALL could be characterized by the “meta-Galaxy” approach into a finite number of recurrent protein expression patterns that could identify key protein targets based on individual protein expression.
The meta-Galaxy analysis is a two-step approach that starts with the analysis of proteins in the context of other proteins that are known to be functionally related or known to interact with each other, and then globally searches for protein patterns that frequently co-occur. This approach arises from the supposition that traditional unsupervised hierarchical clustering ignores known protein interactions and weights each component equally. We hypothesized that if we created smaller subsets of proteins with known functional relationships (i.e., protein functional groups, PFG) and then built overall interaction networks using the individual protein clusters within the PFG as building blocks, we would find more robust protein patterns. Furthermore, this analysis provides insight into which protein patterns resemble normal cells, and which represent distinct protein expression patterns and activation states between protein clusters.
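The hand-off between the two steps can be illustrated with a short sketch. Once every patient has been assigned to one cluster within each PFG, those assignments can be encoded as a binary patient-by-cluster matrix (in this study, each patient is positive for 31 of the 114 protein clusters), on which co-occurring clusters are then sought. The code below builds only that matrix and a simple co-occurrence count; it does not implement the block clustering itself, and the function names, data layout, and toy PFG labels are assumptions made for illustration.

```python
def membership_matrix(assignments, clusters_per_pfg):
    """Encode per-PFG cluster assignments as binary rows.

    assignments      -- {patient: {pfg: cluster_index}}
    clusters_per_pfg -- {pfg: number of clusters found for that PFG}
    Returns (columns, rows): columns is a list of (pfg, cluster_index)
    labels and rows maps each patient to a 0/1 list over those columns.
    """
    columns = [
        (pfg, k)
        for pfg in sorted(clusters_per_pfg)
        for k in range(clusters_per_pfg[pfg])
    ]
    index = {col: i for i, col in enumerate(columns)}
    rows = {}
    for patient, per_pfg in assignments.items():
        row = [0] * len(columns)
        for pfg, k in per_pfg.items():
            row[index[(pfg, k)]] = 1  # exactly one positive cluster per PFG
        rows[patient] = row
    return columns, rows


def cooccurrence(rows, i, j):
    """Number of patients positive for both cluster columns i and j."""
    return sum(1 for r in rows.values() if r[i] and r[j])


# Hypothetical toy example: 2 PFGs with 2 and 3 clusters, 4 patients
toy_sizes = {"PFG_A": 2, "PFG_B": 3}
toy_assignments = {
    "p1": {"PFG_A": 0, "PFG_B": 2},
    "p2": {"PFG_A": 0, "PFG_B": 2},
    "p3": {"PFG_A": 1, "PFG_B": 0},
    "p4": {"PFG_A": 0, "PFG_B": 1},
}
columns, rows = membership_matrix(toy_assignments, toy_sizes)
shared = cooccurrence(rows, columns.index(("PFG_A", 0)), columns.index(("PFG_B", 2)))
```

In this encoding, a constellation corresponds to a group of columns that tend to be positive together, and a signature to a group of patients sharing the same combination of constellations.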
The existence of recurrent protein patterns led to our hypothesis that overexpressed proteins could function as candidate druggable targets for inhibition or deactivation, while underexpressed proteins could function as targets for replacement or reactivation. This concept of replacement has been successfully demonstrated in acute promyelocytic leukemia, where RARα in the fusion gene cannot reach the nucleus, but all-trans retinoic acid can replace this loss of function (39). For proteins that are overexpressed or significantly activated, use of small-molecule inhibitors has proved a viable strategy. The paradigm for this is the use of imatinib (Gleevec) and other tyrosine kinase inhibitors (e.g., bosutinib, nilotinib, and dasatinib) to suppress the constitutively activated ABL kinase activity seen in Ph+ leukemia patients (40). By identifying many targets for each signature, possible rational combinations of targeted therapy could be identified that could be used alone, or in combination with standard chemotherapy. For instance, reactivation of the universally suppressed GATA1 may be useful in inducing differentiation during hematopoiesis (41). Likewise, the universal loss of NR4A1 (42, 43) and TCF4 (44) expression poses an opportunity to restore stem cell regulation by restoring normal expression and/or function. To test this hypothesis, we performed proteomic profiling on leukemia cell lines to find representative cell lines that resemble the protein expression patterns seen in pediatric ALL patients. However, only half of the protein clusters in patients showed similarities to cell lines, calling into question the relevance of leukemia cell lines for testing drug combinations in future experiments.
If aberrantly expressed proteins could aid in determining a patient's risk group, then classification based on protein signatures could be performed at diagnosis and implemented during risk stratification (i.e., prior to consolidation therapy). This process would first need to be tested and validated in larger datasets with more divergence in therapy outcome. If this methodology were shown to be predictive, development of an ELISA or forward-phase protein array kit could potentially classify patients in real time, making routine determination of protein signature membership both feasible and potentially useful for postinduction therapy determination.
A highly important observation was the association of Hispanic ethnicity with signatures. Numerous studies have reported an inferior outcome for patients with Hispanic ethnicity. It is uncertain whether this arises from a different pathophysiology or from socioeconomic factors. We observed a clear skewing of some Hispanic patients to specific signatures, suggesting that for many Hispanic patients the difference in outcome arises from underlying differences in the pathophysiology of their leukemia. A similar finding was noted by Harvey and colleagues, who observed that, within high-risk pediatric ALL, there was a gene expression signature associated with Hispanic ethnicity that had a very poor 4-year relapse-free survival (45). In our study, this was most pronounced in the differential expression of two T-ALL constellations. Protein expression summaries were notable for overexpression of CCND3, DUSP6, RB1, RB1.pS807_811, and STMN1, and underexpression of BCL2 in non-Hispanic groups, and overexpression of FKHRL1 along with decreased expression of XIAP and BIRC5 in the Hispanic-enriched signatures. This suggests that leukemia in Hispanics is associated with a less proliferative “push” in combination with greater resistance to apoptosis due to relatively higher levels of BCL2 and “IAP-proteins” BIRC5 and XIAP. As malignancies with higher proliferation rates are more sensitive to cell-cycle–specific chemotherapy agents, and as cells with reduced antiapoptotic potential are less likely to survive chemotherapy, the constellations provide plausible explanations as to why some Hispanic ALL patients do worse than their non-Hispanic counterparts. However, this observation first needs further validation in a larger patient cohort.
Limitations of our study were the small number of patient samples and the restricted number of antibodies targeting phosphorylation sites represented in the array. Repetition of the analysis in a larger cohort of patients will enable identification of more protein signatures that could more accurately discriminate patients and would likely show heterogeneity in outcome. Moreover, it would be interesting to test how additional mutational analysis using genomic and gene expression sequencing could provide more insight into the correlation between mutational events and protein expression. A previous study in pediatric ALL observed correlation between the mutational state of NOTCH1 and/or FBXW7 and aberrant NOTCH1 protein expression. However, that study also observed NOTCH1 protein activation in some patients without the presence of a NOTCH1 mutation (46). Another study showed that although patients with mutations in the PTEN/AKT pathway were found to have decreased expression of PTEN compared with wild-type controls, there was no difference in phosphorylation of AKT or downstream AKT targets (47).
In conclusion, our findings demonstrate the existence of protein signatures and protein constellations in a cohort of pediatric ALL patients. This approach could be extended to other diseases as well, to compare protein signatures across diseases and to identify disease-specific and universal protein expression patterns.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: F.W. Hoff, S.M. Kornblau, T.M. Horton
Development of methodology: C.W. Hu, A.A. Qutub, S.M. Kornblau, T.M. Horton
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Y. Qiu, M.E. Scheurer, E.S.J.M. De Bont, S.M. Kornblau, T.M. Horton
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): F.W. Hoff, C.W. Hu, S.Y. Yoo, M.E. Scheurer, A.A. Qutub, S.M. Kornblau, T.M. Horton
Writing, review, and/or revision of the manuscript: F.W. Hoff, C.W. Hu, M.E. Scheurer, E.S.J.M. De Bont, A.A. Qutub, S.M. Kornblau, T.M. Horton
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): A. Ligeralde, S.M. Kornblau
Study supervision: A.A. Qutub, S.M. Kornblau, T.M. Horton
Acknowledgments
This research was funded by the Hyundai Hope on Wheels research grant (to T.M. Horton), the Takeda/Millennium R01-CA164024 grant (to T.M. Horton), the St Baldrick's Consortium Grant (to M.E. Scheurer), and the Ladies Leukemia League (to M.E. Scheurer).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.