Abstract
Purpose: To characterize gene expression signatures in acute lymphocytic leukemia (ALL) cells associated with known genotypic abnormalities in adult patients.
Experimental Design: Gene expression profiles from 128 adult patients with newly diagnosed ALL were characterized using high-density oligonucleotide microarrays. All patients were enrolled in the Italian GIMEMA multicenter clinical trial 0496 and samples had >90% leukemic cells. Uniform phenotypic, cytogenetic, and molecular data were also available for all cases.
Results: T-lineage ALL was characterized by a homogeneous gene expression pattern, whereas several subgroups of B-lineage ALL were evident. Within B-lineage ALL, distinct signatures were associated with ALL1/AF4 and E2A/PBX1 gene rearrangements. Expression profiles associated with ALL1/AF4 and E2A/PBX1 are similar in adults and children. The BCR/ABL+ gene expression pattern was more heterogeneous and was most similar to that of ALL without known molecular rearrangements. We also identified a set of 83 genes that were highly expressed in leukemia blasts from patients without known molecular abnormalities who subsequently relapsed following therapy. Supervised analysis of kinase genes revealed high-level FLT3 expression in a subset of cases without molecular rearrangements. Two other kinases (PRKCB1 and DDR1) were highly expressed in cases without molecular rearrangements, as well as in BCR/ABL-positive ALL.
Conclusions: Genomic signatures are associated with phenotypically and molecularly well defined subgroups of adult ALL. Genomic profiling also identifies genes associated with poor outcome in cases without molecular aberrations and specific genes that may be new therapeutic targets in adult ALL.
After more than four decades of intensive research, the cellular origins of acute lymphocytic leukemia (ALL) have been well defined, and several distinct genetic mechanisms that lead to malignant transformation of these cells have been identified (1–4). In ∼25% to 30% of adult patients, ALL cells express multiple genes associated with T-cell differentiation, and T-cell receptor genes show the presence of clonal rearrangements. In the remaining 70% to 75% of adult patients, ALL cells express markers of B-cell differentiation, and immunoglobulin genes reveal unique patterns of clonal rearrangement. These characteristics clearly define the derivation of these leukemias from distinct lymphoid precursor cells that have been committed to either T-lineage or B-lineage differentiation. Although chromosome translocations and molecular rearrangements are relatively infrequent in T-lineage ALL, these events occur commonly in B-lineage ALL and reflect distinct mechanisms of transformation. The relative frequencies of specific molecular rearrangements differ in children and adults with B-lineage ALL. In adult ALL, the BCR/ABL gene rearrangement occurs in about 25% of cases and the ALL1/AF4 gene rearrangement (also termed MLL/AF4) is present in 4% to 7% of cases (5). The BCR/ABL gene rearrangement occurs much less frequently in pediatric ALL, whereas the ALL1/AF4 or other mixed-lineage leukemia (MLL) rearrangements occur primarily in infants with this disease. The TEL/AML1 gene rearrangement, which is usually associated with a favorable outcome, is present in 25% to 30% of pediatric cases but is rare in adults (1, 6). Because these cytogenetic abnormalities reflect distinct mechanisms of transformation, these differences help to explain why children and adults with ALL have such different outcomes following conventional therapy.
Despite the frequent presence of chromosome translocations and gene rearrangements in ALL, no molecular rearrangements are found in ∼50% of adult cases. Although the cellular origins of these leukemias can be readily determined by phenotypic studies, the mechanisms of transformation of these leukemias remain unknown. In this context, large-scale gene expression profiling provides a new approach to explore the mechanisms of transformation of malignant cells. In pediatric ALL, several reports have identified specific gene signatures that reflect known phenotypic characteristics and genetic rearrangements, such as ALL1/AF4, TEL/AML1, E2A/PBX1, and to a lesser extent BCR/ABL (7–11). This approach has also led to the identification of gene expression patterns that may predict response to therapy (8, 9, 12, 13), as well as new potential therapeutic targets (14, 15).
The current study used high-density oligonucleotide microarrays (Affymetrix U95Av2, Santa Clara, CA) to define the gene expression profile of ALL in a well-characterized population of 128 adult patients at the time of diagnosis, before initiation of intensive chemotherapy. Analysis of pretreatment ALL cells also included the uniform characterization of cell surface phenotype and molecular abnormalities. This resulted in the identification of gene expression signatures that were associated with well-defined genetic abnormalities known to play a central role in malignant transformation of these leukemia cells and that are similar to those previously reported in children (9). To characterize potential pathways of transformation in ALL cells without known genetic abnormalities, we compared the gene expression profiles of these cases with those of ALL with defined gene rearrangements. This comparison showed that the expression profile of B-lineage ALL without known molecular rearrangements was more similar to that of cases with BCR/ABL rearrangements than to that of ALL with other genetic rearrangements. Because the BCR/ABL rearrangement results in the formation of an activated tyrosine kinase, these observations suggest that deregulated activation of tyrosine kinase genes may also play an important role in the transformation of at least some ALL cases without other defined molecular rearrangements.
Materials and Methods
Patient samples. Patients were enrolled in the Italian GIMEMA multicenter clinical trial 0496 for adult patients with ALL. Informed written consent was obtained from all patients before therapy. The present study was approved by the institutional review board of the University “La Sapienza” of Rome, Italy. Pretreatment leukemia samples from bone marrow and/or peripheral blood were centrally processed at University “La Sapienza” of Rome; mononuclear cells were isolated by density gradient centrifugation and cryopreserved with 10% DMSO. One hundred twenty-eight leukemia samples containing >90% blasts were selected for gene expression analysis from 431 patients enrolled between 1996 and 2000. Follow-up data were collected at University “La Sapienza”.
Flow cytometry, cytogenetic, and molecular studies. Conventional immunophenotypic, cytogenetic, and molecular diagnostic studies were done on all samples (16). Flow cytometry analysis was done using monoclonal antibodies specific for the following antigens to determine lineage derivation: Tdt, HLA-DR, CD7, CD19, CD10, CD14, CD33, CD13, CD61, CD34, CD2, myeloperoxidase, and sCD3. For ALL cells assigned to either B-lineage or T-cell lineage, expression of the following panel of antigens was used to define the degree of differentiation of leukemic blasts: CD20, cyCD22, cyIgμ, surface immunoglobulin, cyCD3, CD5, CD1a, CD4, and CD8.
RNA extraction. Cryopreserved leukemia cells were rapidly thawed and total RNA was extracted using Trizol reagent (Life Technologies Bethesda Research Laboratories, Grand Island, NY) and purified using the SV Total RNA Isolation System (Promega, Madison, WI) according to the manufacturer's protocol with minor modifications. To assess RNA quality, 2 μL of RNA from each sample were analyzed by agarose gel electrophoresis; the 260/280 ratio was >1.8 for all samples used for microarray analysis (17).
Oligonucleotide microarray. The detailed protocol for sample preparation and microarray processing is available from Affymetrix (http://www.affymetrix.com). Briefly, first-strand cDNA was synthesized from 5 μg total RNA using a T7-(dT)24 primer (Genset Corp., San Diego, CA) and reverse transcribed with the SuperScript Double-Stranded cDNA Synthesis kit (Life Technologies Bethesda Research Laboratories). After synthesis of the second strand of cDNA, the product was used in an in vitro transcription reaction (Bioarray kit, Enzo Diagnostics, Farmingdale, NY) to generate biotinylated complementary RNA, which was subsequently fragmented. Fragmented complementary RNA (15 μg/sample) was hybridized to the Affymetrix U95Av2 GeneChip, which contains ∼12,600 sequences derived from the Genbank database. As suggested by the manufacturer, samples with a 3′/5′ glyceraldehyde-3-phosphate dehydrogenase (GAPDH) ratio of >3 were discarded. After hybridization, each microarray was washed, stained, and scanned with an argon-ion confocal laser, with excitation at 488 nm and detection at 570 nm. CEL files are available at http://www.Bioconductor.org/Docs/Papers/2003/Chiaretti.
Data analysis and statistical methods. Affymetrix gene expression data were processed with dChip (http://www.dchip.org), which uses an invariant set normalization method. The array with the median overall intensity was chosen as the baseline for normalization. Model-based expression values were computed for each array and probe set (18) using only perfect match probes. All other analyses were carried out using the R language (http://www.r-project.org; ref. 19) and Bioconductor packages.
Nonspecific filtering criteria for unsupervised clustering were defined as follows: (a) gene expression level was required to be higher than 100 in >20% of the samples; (b) the ratio of SD to the mean expression across all samples was required to be between 0.6 and 10. We chose these broad criteria because we wanted to concentrate on genes that were common to many B-lineage ALLs or many T-lineage ALLs and not on small subgroups of either population.
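The two filtering criteria above can be sketched as a per-probe-set check. This is a minimal illustration, not the authors' R code: the function name, the toy values, and the use of the sample SD are assumptions made for the example.

```python
from statistics import mean, stdev

def passes_nonspecific_filter(values, min_level=100.0, min_fraction=0.20,
                              cv_low=0.6, cv_high=10.0):
    """Return True if one probe set's expression values pass both criteria."""
    # Criterion (a): expression above min_level in more than min_fraction of samples.
    n_above = sum(1 for v in values if v > min_level)
    if n_above <= min_fraction * len(values):
        return False
    # Criterion (b): ratio of SD to mean must lie between cv_low and cv_high.
    # The sample SD is assumed here; the text does not say which estimator was used.
    m = mean(values)
    if m <= 0:
        return False
    cv = stdev(values) / m
    return cv_low <= cv <= cv_high

# Toy expression values (hypothetical, not taken from the study).
expression = {
    "probe_variable": [50, 400, 900, 30, 1200, 80, 700, 20, 60, 1500],
    "probe_flat": [500, 510, 495, 505, 498, 502, 507, 493, 501, 499],
    "probe_low": [10, 20, 15, 12, 300, 18, 11, 14, 16, 13],
}
selected = [name for name, vals in expression.items()
            if passes_nonspecific_filter(vals)]
print(selected)  # ['probe_variable']
```

In this toy set, the flat probe set fails criterion (b) and the low probe set fails criterion (a), so only the variable, well-expressed probe set is retained.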
To specifically identify genes differentially expressed in pre–B-ALL molecular groups (i.e., ALL1/AF4, BCR/ABL, and E2A/PBX1) and the remaining samples without evidence of major molecular rearrangements (labeled as NEG in the text and figures), we first computed the mean expression of the 12,625 probe sets within each group, and only probe sets with at least one group mean above 100 were retained. Subsequently, an ANOVA testing whether any of the four group means differed was done for each probe set on log-transformed expression values.
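As a rough illustration of this two-step procedure, the sketch below applies the group-mean filter and then computes a one-way ANOVA F statistic on log-transformed values for a single probe set. The function names and toy values are hypothetical. The log base is not stated in the text; base 2 is assumed here, which does not affect the F statistic. Converting F to a P value requires the F distribution (for example, `scipy.stats.f_oneway` returns both) and is left out to keep the example dependency-free.

```python
import math
from statistics import mean

def any_group_mean_above(groups, threshold=100.0):
    """Keep a probe set only if at least one group mean exceeds the threshold."""
    return any(mean(vals) > threshold for vals in groups.values())

def anova_f(groups):
    """One-way ANOVA F statistic computed on log2-transformed values."""
    logged = {g: [math.log2(v) for v in vals] for g, vals in groups.items()}
    all_vals = [v for vals in logged.values() for v in vals]
    grand_mean = mean(all_vals)
    k, n = len(logged), len(all_vals)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in logged.values())
    ss_within = sum((x - mean(v)) ** 2 for v in logged.values() for x in v)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy probe set with two groups (hypothetical values; the study used four groups).
probe = {"GROUP_A": [2, 8], "GROUP_B": [32, 128]}
print(any_group_mean_above(probe, threshold=10.0))  # True
print(anova_f(probe))                               # 8.0
```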
P adjustments for multiple comparisons were done using the step-up false discovery rate controlling method proposed by Benjamini and Hochberg (20), with the false discovery rate controlled under 0.05. Hierarchical clustering and heat maps were used as described by Ihaka and Gentleman (19) and Eisen et al. (21). The distance between two genes was computed as 1 − the correlation between the standardized expression values across samples. False-color maps were used to display distance matrices between samples. The distance between two samples was computed as 1 − the correlation between the aforementioned standardized expression values of the samples. The magnitude of distance is represented in color with a continuous spectrum from blue (close) to red (far). The diagonal of the distance map represents the distance of n samples to themselves (i.e., 0) and is colored in the deepest shade of blue.
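The two computations described here, the step-up adjustment of P values and the correlation-based distance, can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, Pearson correlation is assumed, and the input P values are made up.

```python
import math
from statistics import mean

def benjamini_hochberg(pvalues):
    """Step-up false discovery rate adjustment; returns adjusted P values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def correlation_distance(x, y):
    """Distance defined as 1 minus the Pearson correlation of two profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)

# Hypothetical inputs for illustration only.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
d_same = correlation_distance([1, 2, 3], [2, 4, 6])  # perfectly correlated profiles
d_opp = correlation_distance([1, 2, 3], [3, 2, 1])   # perfectly anticorrelated profiles
```

With this definition the distance ranges from 0 (perfectly correlated, the deepest blue on the map) to 2 (perfectly anticorrelated); a probe set is declared significant when its adjusted P value falls below the chosen false discovery rate.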
For prediction of biological groups, leave-one-out cross-validation was used in the process of probe selection and nearest-neighbor prediction. Specifically, we left one sample out as the test set and retained the other n − 1 samples as our training set. For each training set, we developed three different gene lists, one for each comparison of the NEG group to a group with a known molecular aberration. Genes were filtered using the same criteria as for identification of probes specifically associated with molecular groups, leaving one sample out in turn; the class of the left-out sample was then predicted by one-nearest-neighbor classification, using the set of genes obtained for its true class.
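The key point of this procedure is that gene selection is repeated inside every fold, so the left-out sample never influences the gene list used to classify it. A minimal sketch follows, with several assumptions: the selector is a hypothetical stand-in for the filtering criteria in the text, Euclidean distance is assumed because the text does not name the metric, and the data are invented.

```python
import math

def top_feature_by_mean_difference(train_x, train_y):
    """Hypothetical selector: index of the feature whose class means differ most."""
    classes = sorted(set(train_y))
    best, best_gap = 0, -1.0
    for f in range(len(train_x[0])):
        means = []
        for c in classes:
            vals = [x[f] for x, y in zip(train_x, train_y) if y == c]
            means.append(sum(vals) / len(vals))
        gap = max(means) - min(means)
        if gap > best_gap:
            best, best_gap = f, gap
    return [best]

def loocv_one_nn_error(samples, labels, select_features):
    """Leave-one-out error of a 1-nearest-neighbor classifier with per-fold selection."""
    errors = 0
    for i, (test_x, test_y) in enumerate(zip(samples, labels)):
        # Training set is every sample except the one left out.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        feats = select_features(train_x, train_y)  # selection uses training data only

        def dist(a, b):
            return math.sqrt(sum((a[f] - b[f]) ** 2 for f in feats))

        nearest = min(range(len(train_x)), key=lambda j: dist(test_x, train_x[j]))
        if train_y[nearest] != test_y:
            errors += 1
    return errors / len(samples)

# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
samples = [[1.0, 5.0], [1.2, 6.0], [0.9, 4.0], [3.0, 5.1], [3.1, 6.1], [2.9, 4.1]]
labels = ["NEG", "NEG", "NEG", "BCR/ABL", "BCR/ABL", "BCR/ABL"]
error = loocv_one_nn_error(samples, labels, top_feature_by_mean_difference)
print(error)  # 0.0
```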
A t test was used to identify genes differentially expressed between patients who are in continuous complete remission (CCR) versus those who experienced a relapse. Genes were required to have an average expression above 100 in at least one group, a fold change difference of >1.5, and an adjusted P ≤ 0.05. To strengthen these results, a leave-one-out cross-validation was also done and only genes that were selected in 100% of the analyses were retained.
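The stability requirement (a gene must be selected in every leave-one-out analysis) amounts to intersecting the gene lists obtained across folds. The sketch below illustrates this with only the expression-level and fold-change criteria; the t test and P adjustment are omitted for brevity. Function names, sample identifiers, and values are hypothetical, and fold change is assumed to be the ratio of group means in either direction.

```python
from statistics import mean

def passes_level_and_fold(ccr, relapse, min_mean=100.0, min_fold=1.5):
    """Mean above min_mean in at least one group and fold change above min_fold."""
    m_ccr, m_rel = mean(ccr), mean(relapse)
    if max(m_ccr, m_rel) <= min_mean:
        return False
    # Assumes strictly positive expression values.
    fold = max(m_ccr, m_rel) / min(m_ccr, m_rel)
    return fold > min_fold

def stable_genes(data, ccr_ids, relapse_ids):
    """Genes that pass the criteria in every leave-one-out subset of samples."""
    kept = None
    for left_out in ccr_ids + relapse_ids:
        passing = set()
        for gene, values in data.items():
            ccr = [values[s] for s in ccr_ids if s != left_out]
            rel = [values[s] for s in relapse_ids if s != left_out]
            if passes_level_and_fold(ccr, rel):
                passing.add(gene)
        kept = passing if kept is None else kept & passing
    return kept

# Toy data (hypothetical): geneB passes only because of one outlying relapse sample.
data = {
    "geneA": {"c1": 100, "c2": 120, "r1": 300, "r2": 320, "r3": 310},
    "geneB": {"c1": 200, "c2": 210, "r1": 205, "r2": 190, "r3": 600},
}
stable = stable_genes(data, ["c1", "c2"], ["r1", "r2", "r3"])
print(stable)  # {'geneA'}
```

The toy example shows why the requirement is useful: a gene whose apparent difference depends on a single sample drops out as soon as that sample is left out.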
We also concentrated on genes annotated at the protein-tyrosine kinase node of Gene Ontology (Gene Ontology: 0004713, 299 probe sets) and used a similar procedure. Again, gene lists reflecting pairwise comparisons of the NEG group to the other three were obtained. The selection criteria included (a) an average expression value of >100 in at least one group, (b) a nominal P < 0.05, and (c) a fold change difference of ≥1.5. To identify additional tyrosine kinase genes highly expressed in the NEG group, we identified genes that were highly expressed in the following: (a) all four molecular subsets, (b) the NEG group and any two other subsets, and (c) the NEG group and any other single subset. For this analysis only, gene selection criteria required a mean expression of >500 (sufficiently high values), and equal expression between groups was arbitrarily defined as a mean fold difference between 0.9 and 1.1.
Real-time quantitative PCR. Real-time quantitative reverse transcription-PCR analysis was done using an ABI PRISM 7700 Sequence Detection System and the SYBR green I dye (PE Biosystems, Foster City, CA) method as previously described (22). Real-time PCR conditions were as follows: one cycle at 50°C for 2 minutes and one cycle at 95°C for 10 minutes, followed by 40 cycles of 95°C for 15 seconds and 60°C for 1 minute. For each sample, CT values for GAPDH were determined for normalization purposes. For each run of samples, a correction factor was calculated by dividing by the minimum GAPDH values and applied to normalize the CT values of the genes of interest. Primers were designed using the Primer Express 1.0 software. The following primers were used: 5′ GAPDH, 5′-CCACCCATGGCAAATTCC-3′; 3′ GAPDH, 5′-GATGGGATTTCCATTGATGACA-3′; 5′ FLT3, 5′-AAATTCCACCAGCATGCCTGGTT-3′; 3′ FLT3, 5′-TCCGAGTCCGGGGTGTATCTG-3′; 5′ DDR1, 5′-GATTTCCCCCTTAATGTGCGT-3′; 3′ DDR1, 5′-TGGCATCTGGCCGTAAGATC-3′; 5′ PRKCB1, 5′-CAGAAGGAAGTGAGGCCAATG-3′; and 3′ PRKCB1, 5′-TCTTCCGGGACCTTGGTTC-3′.
Results
Patient characteristics. Conventional immunophenotypic, cytogenetic, and molecular diagnostic studies were done on pretreatment leukemia samples from the 128 patients (Table 1; Supplementary Table 1). The median age of patients was 29 years (range, 15-58); 43 were females and 85 were males. Immunophenotypic analysis indicated T-cell derivation in 33 samples and B-cell lineage in 95 samples. Well-defined cytogenetic abnormalities were identified in 43 cases. Other simple or complex karyotypic abnormalities were found in 22 samples and a normal karyotype was identified in 24 samples. Thirty-nine cases did not have sufficient metaphases for cytogenetic analysis. Molecular diagnostic studies confirmed the presence of BCR/ABL gene rearrangements in all 26 patients with t(9;22) and in 11 additional samples which were not evaluable by cytogenetic analysis. The ALL1/AF4 gene rearrangement was detected in all seven pro–B ALL cases with t(4;11), in one patient who was not evaluable by conventional cytogenetic testing, and in two samples with a normal karyotype. The E2A/PBX1 rearrangement was found in five samples, including two with t(1;19), one with a normal karyotype, and two with insufficient metaphases for cytogenetic analysis. BCR/ABL, ALL1/AF4, and E2A/PBX1 gene rearrangements were only detected in leukemias with a B-lineage cell phenotype. A distinct t(4;11) rearrangement involving the NUP98 gene was found in a single T-lineage ALL (23, 24). Thirty-six cases that did not harbor any major molecular abnormality were evaluable for long-term outcome: 12 patients were in CCR with a median follow-up of 60.5 months (range, 29-76), whereas the remaining 24 experienced a relapse within a median period of 11 months (range, 2-69).
 | Pre–B-ALL (n = 95) | T-ALL (n = 33)
---|---|---
Conventional cytogenetic analysis | |
t(9;22) | 26 | 0
t(4;11) | 7 | 1
t(10;14) | 0 | 1
t(1;19) | 2 | 0
del(6q) | 2 | 2
del(7q) | 1 | 1
Simple karyotypic rearrangements | 9 | 4
Complex karyotypic abnormalities | 6 | 3
Normal karyotype | 16 | 8
Not evaluable | 26 | 13
Molecular rearrangements | |
BCR/ABL | 37* | 0
ALL1/AF4 | 10 | 0
E2A/PBX1 | 5* | 0
NUP-98 | 0 | 1
NEG | 42* | 32
Abbreviations: B-ALL, B-lineage ALL; T-ALL, T-lineage ALL.
*p15 or p15/p16 deletions were detected in six BCR/ABL samples, one E2A/PBX1 sample, and five samples without molecular rearrangements.
Gene expression profiles distinguish T-cell and B-cell lineage derivation. Applying the filtering criteria to all 128 samples resulted in the selection of 792 genes, which showed sufficient levels of expression and variation across groups of interest. Unsupervised hierarchical clustering based on the expression of this set of genes identified two groups that coincided precisely with the phenotypic classification of T-cell and B-cell lineage determined by immunophenotypic analysis of cell surface markers (Fig. 1). As expected, T-lineage ALL samples were characterized in part by the homogeneous expression of known T-cell lineage antigens such as CD3δ, CD3ζ, and TCR-associated genes, and B-lineage ALL were characterized by the homogeneous expression of MHC class II, CD19, and immunoglobulin-associated genes. However, many genes not known to be associated with either T-cell or B-cell lineage were also differentially expressed in these two sets of samples. As shown in Fig. 1, the T-lineage ALL group seemed relatively homogeneous. In contrast, the B-lineage ALL group seemed heterogeneous, and it was possible to identify smaller subsets of samples based on a variable profile of expression within this large set of differentially expressed genes.
Gene expression profile of adult B-lineage acute lymphocytic leukemia identifies signatures of ALL1/AF4, E2A/PBX1, and BCR/ABL gene rearrangements. To further characterize the B-lineage ALL samples, we first performed unsupervised hierarchical clustering of the 95 samples. Different filtering criteria were applied (see Data analysis and statistical methods), and regardless of the criteria used in gene filtering, cases with E2A/PBX1 and ALL1/AF4 gene rearrangements formed two distinct clusters that corresponded precisely to the molecular classification of these samples (data not shown). In contrast, samples with BCR/ABL rearrangements showed a more heterogeneous profile, being distributed in several clusters. Moreover, each BCR/ABL cluster also included NEG samples. Similarly, NEG cases could not be grouped into distinct clusters. These results suggested that both BCR/ABL-positive cases and NEG samples are likely to include several subgroups, and also that the gene expression profiles of these two groups were more similar to each other than to other molecularly defined groups.
Subsequently, to identify genes associated with known molecular abnormalities in B-lineage ALL, samples were assigned to one of four groups: ALL1/AF4, E2A/PBX1, BCR/ABL, and NEG. ANOVA was applied to identify genes with significantly different mean expression values in at least one of the four B-lineage ALL groups. This resulted in the selection of 167 probes with adjusted P values < 0.0001. The top genes in each category with an adjusted P < 1 × 10⁻⁵ are listed in Table 2. The complete list of probes is summarized in the Supplementary Material. As shown in Fig. 2, clustering using this set of genes led to the correct grouping of all ALL1/AF4 and E2A/PBX1 samples, whereas BCR/ABL-positive cases were again clustered together with NEG samples.
Gene | Locus link ID | Adjusted P | Function
---|---|---|---
ALL1/AF4 | | |
HOXA9 | 3205 | <10⁻¹³ | Transcription factor
HOXA10 | 3206 | <10⁻¹³ | Transcription factor
VLDLR | 7436 | <10⁻¹³ | Receptor
MEIS1 | 4211 | <10⁻¹³ | RNA polymerase II transcription factor
DPYSL3 | 1809 | <10⁻¹³ | Nucleic acid metabolism
PALLADIN | 23022 | 1 × 10⁻¹³ | Cytoskeleton
KIAA1157 | 57460 | 1.5 × 10⁻¹² | Unknown
PROM1 | 8842 | 2.4 × 10⁻¹² | Membrane protein
CCNA1 | 8900 | 1.3 × 10⁻¹¹ | Cell cycle control
LGALS1 | 3956 | 2.3 × 10⁻¹⁰ | Control of cell proliferation
PRSS12 | 8492 | 3.2 × 10⁻¹⁰ | Serine protease
NPTX2 | 4885 | 4.4 × 10⁻¹⁰ | Pentaxin
FUT4 | 2526 | 1 × 10⁻⁹ | Fucosyltransferase
TBC1D8 | 11138 | 4.8 × 10⁻⁹ | Positive control of cell proliferation
DIAPH2 | 1730 | 8 × 10⁻⁹ | Actin binding
DAD1 | 1603 | 1.2 × 10⁻⁷ | Antiapoptosis
HOXA5 | 3202 | 1.5 × 10⁻⁷ | Transcription factor
RHOBTB3 | 22836 | 3 × 10⁻⁷ | Unknown
KIAA0125 | 9834 | 4.1 × 10⁻⁷ | Unknown
E2A/PBX1 | | |
PBX1 | 5087 | <10⁻¹³ | Transcription factor
FAT | 2195 | <10⁻¹³ | Cell adhesion
KANK | 23189 | 2.5 × 10⁻¹² | Cell cycle control
FGF9 | 2254 | 1.7 × 10⁻¹¹ | Cell cycle control
CRYM | 1428 | 2.2 × 10⁻¹¹ | Vision
NID2 | 22795 | 2.3 × 10⁻¹¹ | Cell adhesion
KIAA0802 | 23255 | 8.7 × 10⁻¹¹ | Unknown
LAMA5 | 3911 | 1.8 × 10⁻⁹ | Differentiation
AOX1 | 316 | 1.9 × 10⁻⁹ | Oxidoreductase activity
SLAM | 6504 | 3.1 × 10⁻⁹ | B-cell proliferation
PRKCZ | 5590 | 5.8 × 10⁻⁸ | Protein kinase
MERTK | 10461 | 1.3 × 10⁻⁷ | Oncogene/tyrosine kinase
ALDH1A1 | 216 | 1.5 × 10⁻⁷ | Aldehyde dehydrogenase activity
PKM2 | 5315 | 4.3 × 10⁻⁷ | Pyruvate kinase
Unknown | Unknown | 1.4 × 10⁻⁶ | Unknown
ACK1 | 10188 | 1.4 × 10⁻⁶ | Non–receptor tyrosine kinase
NCBP2 | 22916 | 1.6 × 10⁻⁶ | RNA binding
BLK | 640 | 2.1 × 10⁻⁶ | Protein tyrosine kinase
TMSNB | 11013 | 2.4 × 10⁻⁶ | Actin binding
TRIB2 | 28951 | 1.5 × 10⁻⁵ | Protein kinase
CGI-49 | 51097 | 1.6 × 10⁻⁵ | Unknown
LRMP | 4033 | 1.7 × 10⁻⁵ | Lymphocyte development
KIAA0889 | 25781 | 2.1 × 10⁻⁵ | Unknown
E2F5 | 1875 | 2.1 × 10⁻⁵ | Transcription factor
BCR/ABL | | |
CDW52 | 1043 | 9.4 × 10⁻¹³ | Integral to plasma membrane
ABL1 | 25 | 1.3 × 10⁻¹¹ | Tyrosine kinase activity
YES1 | 7525 | 6.1 × 10⁻¹¹ | Tyrosine kinase activity
SOCS2 | 8835 | 5.7 × 10⁻¹⁰ | Regulation of cell growth
CCND2 | 894 | 3 × 10⁻⁹ | Cell cycle progression
HLA-DPA1 | 3113 | 4.8 × 10⁻⁹ | HLA class II antigen
HLA-DOA | 3111 | 3.6 × 10⁻⁸ | HLA class II antigen
CD2AP | 23607 | 4 × 10⁻⁸ | Cytoskeleton
TCF8 | 6935 | 5 × 10⁻⁸ | Growth inhibition
HLA-C | 3107 | 5.9 × 10⁻⁸ | HLA class II antigen
FYN | 2534 | 1 × 10⁻⁷ | Protein kinase activity
HLA-DPB1 | 3115 | 1.3 × 10⁻⁷ | HLA class II antigen
GAB1 | 2549 | 1.6 × 10⁻⁷ | Cell proliferation
IFITM1 | 8519 | 3.6 × 10⁻⁷ | IFN response
OPTN | 10133 | 3.6 × 10⁻⁷ | TNF inhibition
ITGA6 | 3655 | 5.9 × 10⁻⁷ | Integrin complex
HLA-A | 3105 | 7.1 × 10⁻⁷ | HLA class I antigen
PSAP | 5660 | 9.2 × 10⁻⁷ | Saposin
E2F2 | 1870 | 1.1 × 10⁻⁶ | Cell cycle progression
CD24 | 934 | 1.8 × 10⁻⁶ | Plasma membrane
NOTE: Genes are rank ordered according to their adjusted P.
Abbreviations: B-ALL, B-lineage ALL; TNF, tumor necrosis factor.
ALL1/AF4 was characterized by high levels of expression of a large set of genes (∼50). As previously reported, this gene rearrangement was strongly associated with the up-regulation of several HOX family genes (9, 14, 25), as well as the FLT3 receptor (15). Five of these 10 samples were further examined for the presence of either internal tandem duplication or D835 mutations of FLT3 receptor and none showed any abnormalities.
The E2A/PBX1 gene rearrangement also showed a distinct and homogeneous pattern characterized by the high level of expression of several genes (∼35). The genes most strongly associated with this group were represented by PBX1, FAT, NID2, and KANK (26, 27). Interestingly, two samples without known molecular rearrangements clustered tightly with the E2A/PBX1 group. In one case, conventional cytogenetic analysis revealed a normal karyotype, but the second case did not have sufficient metaphases for cytogenetic analysis. Molecular diagnostic studies did not reveal the E2A/PBX1 rearrangement in these two samples. However, comparative genomic hybridization showed gains in the 1q region in both cases (1q23 in one case).
The BCR/ABL group was characterized by high expression of ∼70 genes. These included several tyrosine kinase genes (ABL, FYN, and YES1) and genes associated with cell cycle progression (CCND2 and E2F2). However, compared with the ALL1/AF4 and E2A/PBX1 groups, the gene expression profile in BCR/ABL-positive ALL was more heterogeneous. This heterogeneity was not associated with the expression of different BCR/ABL-derived fusion proteins (p190 and p210; data not shown).
Gene expression profiles reveal similarities between BCR/ABL-positive acute lymphocytic leukemia and samples without molecular rearrangements. Although our analysis did not identify a distinct or homogeneous gene expression pattern in NEG ALL samples, these cases seemed to have gene expression signatures that were more similar to cases with the BCR/ABL rearrangement than with other molecular rearrangements. To further characterize the differences and similarities between different B-lineage ALL subsets, we used two approaches. First, we generated a distance map based on the expression of the 167 probes selected by ANOVA (Fig. 3). A distance map compares each sample with every other; the diagonal represents the comparison of each sample with itself and establishes a correlation of 1 (represented in blue) and hence a distance of 0. In this representation, samples with a high degree of correlation will appear blue. As shown in Fig. 3, samples with ALL1/AF4 cluster together and are quite distinct from all other groups (the large blue rectangles centered on the diagonal). Samples with E2A/PBX1 and known 1q gain also cluster together and are quite distinct. In contrast, the BCR/ABL samples do not homogeneously cluster together and are admixed with samples that do not have known molecular rearrangements.
Second, nearest-neighbor prediction with leave-one-out cross-validation was used to evaluate the error rate when discriminating each molecular group from NEG. As summarized in Table 3, the prediction error rate was 0% for ALL1/AF4 and 4% for E2A/PBX1, for which one NEG sample and one E2A/PBX1 sample were misclassified. The misclassified NEG sample has a known gain at 1q23. In contrast, seven NEG and five BCR/ABL cases were misclassified in the BCR/ABL comparison, and the overall classification error rate was 15%. This relatively high error rate again suggests that these two groups are more difficult to distinguish than the others at the gene expression level. Notably, the NEG cases that were misclassified did not have distinct biological characteristics. Of the seven NEG patients who were misclassified, three experienced a relapse, one died during induction chemotherapy, one died following stem cell transplantation procedures, and only two are in CCR.
Molecular groups | Model prediction | | Error rate (%)
---|---|---|---
 | ALL1/AF4 | Negative |
ALL1/AF4 | 10 | 0 | 0
NEG* | 0 | 42 |
 | E2A/PBX1 | Negative |
E2A/PBX1 | 4 | 1 | 4
NEG | 1† | 41 |
 | BCR/ABL | Negative |
BCR/ABL | 32 | 5 | 15
NEG | 7 | 35 |
*Samples without molecular abnormalities are defined as negative.
†Comparative genomic hybridization analysis revealed a gain of 1q23.
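The error rates in Table 3 follow directly from the counts: the number of misclassified samples in both classes divided by the total number of samples in the comparison. The short check below uses the counts from the table; the function and variable names are illustrative only.

```python
def error_rate(group_correct, group_missed, neg_missed, neg_correct):
    """Percentage of misclassified samples in a two-class comparison."""
    total = group_correct + group_missed + neg_missed + neg_correct
    return 100.0 * (group_missed + neg_missed) / total

# Counts from Table 3, in the order:
# (group predicted as group, group predicted as negative,
#  NEG predicted as group, NEG predicted as negative)
table3 = {
    "ALL1/AF4": (10, 0, 0, 42),
    "E2A/PBX1": (4, 1, 1, 41),
    "BCR/ABL": (32, 5, 7, 35),
}
rates = {name: round(error_rate(*counts)) for name, counts in table3.items()}
print(rates)  # {'ALL1/AF4': 0, 'E2A/PBX1': 4, 'BCR/ABL': 15}
```

For example, the BCR/ABL comparison has 5 + 7 = 12 errors among 79 samples, which rounds to 15%.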
Gene expression profiles identify a set of genes highly expressed in relapsed patients. To identify genes associated with long-term outcome, we compared leukemia gene expression profiles in patients who remain in CCR with profiles from cases that subsequently relapsed after conventional therapy. Given the adverse effect of known molecular abnormalities, this analysis was done exclusively on samples that did not carry any major abnormalities. Eighty-seven probe sets, corresponding to 83 genes, were differentially expressed between patients who are still in CCR (n = 12) versus those who experienced a relapse (n = 24; Fig. 4). Virtually all the identified genes were highly expressed in samples from patients who subsequently relapsed. Genes expressed at high levels in the relapse group could be functionally subdivided into the following categories: (a) cell motility, cell interaction, and cytoskeleton organization (SDFR1, DAAM1, IQGAP1, ADD3, TLN1, ITGB1, LGALS8, CAPZB, DNCI2, LSP1, and nexilin); (b) membrane antigens (CD79a, CD79b, CD24, CD45, CD62L, and CD85D); (c) members of the phosphatidylinositol pathway (PIP5K2B, PIK3C2B, PHKB, ITPR3, and PRKCB1); (d) resistance to chemotherapy (ASNS, CROP, and MVP); and (e) DNA modeling and response to damage (HIST1H2BD, HIST1H2AC, ATRX, and ATR; see Supplementary Table 3 for a complete list).
Comparative expression of kinase genes in pre–B-lineage acute lymphocytic leukemia. Because the BCR/ABL gene rearrangement results in leukemic transformation through the formation of a constitutively activated tyrosine kinase, we specifically compared the expression of tyrosine kinase genes in the four groups of B-lineage ALL samples. ANOVA was used to compare NEG samples with every other group (ALL1/AF4, E2A/PBX1, and BCR/ABL) using exclusively genes known to have a kinase activity (as reported by Gene Ontology: 0004713, 299 probe sets, 190 transcripts). As shown in Fig. 5A, comparison of NEG and E2A/PBX1 identified nine kinases (five tyrosine kinases: LCK, ACK1, ZAP70, BLK, and MERTK; four serine/threonine kinases: STK39, NEK4, KIAA0175, and a dual-specificity protein kinase, MAP2K3) that were more highly expressed in the E2A/PBX1 samples, but only two tyrosine kinases (FLT3 and RYK) were more highly expressed in the samples without rearrangements; particularly high levels were observed for FLT3. When compared with ALL1/AF4 samples (Fig. 5B), four kinases (two tyrosine kinases: BLK and FLT3; and two serine/threonine kinases: MAP3K5/ASK1 and ULK1) were more highly expressed in samples with this rearrangement, and eight genes (six tyrosine kinases: FGFR1, FLT1, DDR1, LCK, FYN, and RYK; and two serine/threonine kinases: PRKCB1 and PRKY) were more highly expressed in the NEG group. When compared with the samples with BCR/ABL rearrangements (Fig. 5C), only three tyrosine kinases were more highly expressed in the BCR/ABL group (ABL, YES1, and FYN), and FLT3 was the only tyrosine kinase that was more highly expressed in the samples without BCR/ABL.
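As an illustration of how ANOVA can rank kinase probes in a two-group comparison such as NEG versus E2A/PBX1, the sketch below computes a one-way ANOVA F statistic per probe. It is a simplified example under stated assumptions: the probe names and values are hypothetical, and the actual model, filtering, and significance thresholds used in the study are not specified in this section.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of measurements."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical probes, each with values for [NEG samples, E2A/PBX1 samples].
probes = {
    "probe_A": [[1200.0, 1500.0, 1100.0], [200.0, 260.0, 180.0]],
    "probe_B": [[300.0, 320.0, 310.0], [305.0, 315.0, 295.0]],
}
# Probes with the largest F statistic separate the two groups most clearly.
ranked = sorted(probes, key=lambda p: anova_f(probes[p]), reverse=True)
```

Here `probe_A` ranks first because its group means differ widely relative to the spread within each group, whereas `probe_B` shows almost no group difference.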
Consistent with previous reports, the FLT3 gene was expressed at high levels in samples with ALL1/AF4 rearrangements. However, FLT3 was also highly expressed in NEG samples when compared with BCR/ABL and E2A/PBX1 samples. In fact, the high-level expression of FLT3 (>1,000 units) was found in 20 NEG samples (48% of cases). To confirm the results obtained by oligonucleotide microarrays, real-time quantitative PCR was done on 33 samples without molecular rearrangements and available material. FLT3 mRNA levels measured by microarrays were highly correlated with those measured by quantitative PCR (Pearson coefficient, −0.8).
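The two figures in this paragraph can be illustrated with a short calculation. The share of NEG samples with high FLT3 expression uses the counts given in the text (20 of 42). The correlation example uses hypothetical paired values; the text does not state the scale of the PCR measurements, so the sketch assumes a cycle-threshold-like scale on which lower values mean more transcript, in which case a negative coefficient would indicate agreement between the two methods.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical paired measurements for five samples (not data from the study).
array_units = [250.0, 600.0, 1100.0, 1800.0, 2500.0]
pcr_cycles = [30.1, 28.7, 27.9, 26.8, 26.5]  # assumed: lower value = more transcript
r = pearson(array_units, pcr_cycles)  # negative when the two methods agree

# Share of NEG samples above the 1,000-unit cutoff, from the counts in the text.
high_share = round(100 * 20 / 42)
print(high_share)  # 48
```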
To specifically identify kinase genes that were expressed at high levels in NEG samples, we also identified those genes within this class (using Gene Ontology: 0004713) that were expressed at relatively high levels in any of the four subsets of B-lineage ALL (Table 4). No kinases represented on the Affymetrix U95Av2 arrays were exclusively expressed at high levels in the samples without molecular rearrangements. Four kinase genes were expressed at high levels in all four subsets; eight genes were expressed at equally high levels in three groups including NEG samples. Two genes, PRKACA and AXL, were expressed at comparably high levels in the NEG and ALL1/AF4 groups, and two other kinase genes, DDR1 and PRKCB1, were highly expressed in both the NEG and BCR/ABL groups (Pearson correlation coefficient between oligonucleotide arrays and quantitative PCR: −0.68 and −0.8, respectively).
Gene symbol | Locus link ID | Chromosome location
---|---|---
High expression in NEG, BCR/ABL, E2A/PBX1, and ALL1/AF4 | |
EFNA3 | 1944 | 1q21-q22
ARAF-1 | 369 | Xp11.4-p11.2
RAF-1 | 5894 | 3p25
MAP2K1 | 5604 | 15q22.1-q22.33
High expression in NEG, BCR/ABL, and ALL1/AF4 | |
CLK3 | 1198 | 15q24
CSK | 1445 | 15q23-q25
DYRK1A | 1859 | 21q22.13
ROCK1 | 6093 | 18q11.2
OSR1 | 9943 | 3p22-p21.3
MAP3K11 | 4296 | 11q13.1-13.3
High expression in NEG, BCR/ABL, and E2A/PBX1 | |
PSK | 51677 | 16p12.1
High expression in NEG, ALL1/AF4, and E2A/PBX1 | |
NTRK3 | 4916 | 15q25
High expression in NEG and ALL1/AF4 | |
PRKACA | 5566 | 19p13.1
AXL | 558 | 19q13.1
High expression in NEG and BCR/ABL | |
DDR1 | 780 | 6p21.3
PRKCB1 | 5579 | 16p11.2
Abbreviations: NEG, without known molecular rearrangements; B-ALL, B-lineage ALL.
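The grouping in Table 4 can be checked by recording, for each kinase gene, the set of subsets in which it is highly expressed and then counting genes by the number of subsets. The gene-to-subset assignments below are transcribed from the table; the data structure and variable names are illustrative choices, not part of the original analysis.

```python
from collections import Counter

ALL_FOUR = {"NEG", "BCR/ABL", "E2A/PBX1", "ALL1/AF4"}

# Gene -> subsets with high expression, transcribed from Table 4.
table4 = {
    "EFNA3": ALL_FOUR, "ARAF-1": ALL_FOUR, "RAF-1": ALL_FOUR, "MAP2K1": ALL_FOUR,
    "CLK3": {"NEG", "BCR/ABL", "ALL1/AF4"},
    "CSK": {"NEG", "BCR/ABL", "ALL1/AF4"},
    "DYRK1A": {"NEG", "BCR/ABL", "ALL1/AF4"},
    "ROCK1": {"NEG", "BCR/ABL", "ALL1/AF4"},
    "OSR1": {"NEG", "BCR/ABL", "ALL1/AF4"},
    "MAP3K11": {"NEG", "BCR/ABL", "ALL1/AF4"},
    "PSK": {"NEG", "BCR/ABL", "E2A/PBX1"},
    "NTRK3": {"NEG", "ALL1/AF4", "E2A/PBX1"},
    "PRKACA": {"NEG", "ALL1/AF4"},
    "AXL": {"NEG", "ALL1/AF4"},
    "DDR1": {"NEG", "BCR/ABL"},
    "PRKCB1": {"NEG", "BCR/ABL"},
}

# Number of genes highly expressed in four, three, or two subsets.
by_size = Counter(len(groups) for groups in table4.values())

# Genes shared only by the NEG and BCR/ABL subsets.
neg_and_bcr_abl = sorted(g for g, s in table4.items() if s == {"NEG", "BCR/ABL"})
print(neg_and_bcr_abl)  # ['DDR1', 'PRKCB1']
```

The counts reproduce the summary in the text: four genes in all four subsets, eight genes in three subsets that include NEG, and two pairs of genes shared by NEG with one other subset.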
Discussion
The primary objective of our study was to establish gene expression signatures associated with known phenotypic and genetic abnormalities of adult ALL patients. Previous studies have examined gene expression in pediatric ALL (8–11, 14, 28, 29), but the genetic abnormalities that lead to malignant transformation of immature lymphoid progenitors, as well as their frequency, differ in pediatric and adult patients (4). Importantly, long-term outcome following treatment is markedly worse in adult patients. We therefore focused our attention on ALL cells obtained before therapy in a well-defined cohort of 128 adult patients enrolled in a single treatment protocol (GIMEMA 0496; refs. 30, 31). By correlating the expression of ∼12,600 probes with conventional phenotypic and molecular characteristics, we identified gene expression signatures associated with known phenotypic characteristics of adult ALL and also defined gene expression patterns associated with distinct mechanisms of leukemic transformation. Unsupervised hierarchical clustering of all ALL samples showed two distinct patterns of gene expression that correlated precisely with either T or B lineage. Although the gene expression profile of T-lineage ALL was relatively homogeneous, more detailed analysis of this data set previously identified distinct signatures associated with initial response to intensive induction chemotherapy, as well as with long-term maintenance of remission (12).
In contrast to T-lineage ALL, the profile of B-lineage ALL was very heterogeneous. The cellular origin seems relatively uniform in B-lineage ALL, but the molecular heterogeneity of this leukemia is well established. Molecular rearrangements of BCR/ABL, ALL1/AF4, and E2A/PBX1 were present in 55% of our population. The ALL1/AF4 rearrangement was uniquely associated with a high level of expression of several related transcription factors, which seem to be direct targets of MLL (14, 25, 32). Similar results have been reported by other investigators (9, 14, 25). In agreement with other reports (14, 15, 33, 34), our study also found that ALL1/AF4+ cases were characterized by overexpression of FLT3, a tyrosine kinase whose mutations represent the most common genetic rearrangements in acute myelogenous leukemia (35). Inhibitors of this tyrosine kinase have proven to be effective in vitro in ALL cells from pediatric as well as adult cases (15, 36, 37).
ALL cells from adult patients with the E2A/PBX1 rearrangement were also associated with a homogeneous gene expression pattern. In this group, it was also possible to identify a set of kinases, such as PRKCZ, BLK, and MERTK, that are consistently associated with this rearrangement in pediatric cases and may represent potential therapeutic targets (9, 10, 15, 38). Remarkably, two samples with a gain in the 1q and 1q23 region, respectively, showed a very similar profile to five samples with the E2A/PBX1 rearrangement. These results suggest that the gene expression signature associated with the E2A/PBX1 rearrangement may be primarily a reflection of increased gene dosage for PBX1. Alternatively, these cases may contain other mutations that result in the activation of PBX1 and a complex pattern of gene expression that is essentially identical to that observed in cases harboring the E2A/PBX1 rearrangement. Further characterization of the PBX1 gene in ALL samples with gains in 1q23 will be necessary to determine the mechanism responsible for deregulation of this transcription factor in these cases. Because these cases were not detected by conventional cytogenetic analysis or molecular studies, deregulation of PBX1 may occur more commonly than previously recognized in adult B-lineage ALL.
Remarkably, for both E2A/PBX1 and ALL1/AF4 rearrangements, the gene expression signatures seem very similar in both pediatric and adult ALL. This was tested directly by combining our data set with the pediatric data set described by Yeoh et al. (9). Hierarchical clustering of 300 B-lineage samples based on the genes selected by ANOVA resulted in the tight grouping of both adult and pediatric samples with MLL rearrangements. Similarly, all samples with E2A/PBX1 clustered together, regardless of patient age (Supplementary Fig. 1). These results suggest that both genetic rearrangements lead to very similar downstream effects in both adult and pediatric cases, and are, at least for the MLL rearrangements, consistent with the poor outcome of both cohorts.
ALL cells with the BCR/ABL rearrangement were also found to have a distinct but more heterogeneous gene expression profile (10, 11). In addition to ABL, other protein kinases such as YES1 and FYN are also highly expressed in these cells compared with ALL1/AF4-positive and E2A/PBX1-positive cases. The identification of additional kinases that are overexpressed in these samples is of great importance, considering that the introduction of imatinib has drastically changed the outcome of BCR/ABL+ ALL (39) and that newer BCR/ABL kinase inhibitors, such as AMN107 and BMS-354825 (40, 41), which show higher efficacy in vitro, are presently being evaluated in phase I/II clinical trials.
When analyzed as a group, ALL cases without known molecular rearrangements did not seem to have a unique gene expression signature. Although these leukemic cells were clearly distinct from blasts with rearrangements involving specific transcription factors (ALL1/AF4 and E2A/PBX1), their pattern of gene expression was more reminiscent of BCR/ABL+ cases. This was first noted in unsupervised hierarchical clustering and confirmed by direct comparison of gene expression patterns using distance maps and prediction analysis. The activation of transcription factors in both acute lymphoid (10, 11, 28, 42) and myeloid (43–45) leukemia cells consistently results in unique gene expression signatures. The observation that NEG ALL cases, when considered as a single group, do not have a distinctive signature suggests that the transforming events do not involve a single leukemogenic transcription factor. However, the similarity of expression patterns in NEG and BCR/ABL+ ALL suggests the activation of tyrosine kinases in at least some of these cases. Notably, two tyrosine kinases, FLT3 and DDR1 (38), were highly expressed in adult NEG cases. Increased expression of FLT3 has also been found in a small percentage of B-lineage ALL in children without other genetic rearrangements (46, 47). Nevertheless, this seems to be a heterogeneous group and other potential mechanisms are also likely to be involved in these cases.
Finally, comparison of ALL cases without known molecular rearrangements that remained in CCR with cases that subsequently relapsed resulted in the identification of 83 genes that were differentially expressed. This analysis was done exclusively on samples without aberrations because of the well-established adverse effect of abnormalities such as ALL1/AF4, E2A/PBX1, and BCR/ABL on outcome (1–5). Notably, virtually all of these genes had increased expression in the relapse group. Similar results have previously been reported by Teuffel et al. (48), who compared gene expression in pediatric ALL stratified by different risk groups. In that study, 75% of the identified genes had high levels of expression in the high-risk group (48). Among the genes that were associated with relapse, several have previously been associated with resistance to chemotherapy. For example, ASNS (asparagine synthetase) is directly involved in the metabolism of L-asparaginase, a compound widely used in ALL regimens. The finding that this gene is expressed at higher levels in adult patients with a poor prognosis suggests that this gene may contribute to resistance in these patients (49). Similarly, high-level expression of a set of genes involved in the phosphatidylinositol pathway (PIP5K2B, PIK3C2B, PHKB, ITPR3, and PRKCB1) indicates that this pathway can also contribute to resistance to chemotherapy.
In conclusion, these results confirm the strong influence of cell lineage derivation and of specific rearrangements on gene expression in adult ALL. Furthermore, this analysis identifies a set of kinases within defined subsets of ALL that represent potential therapeutic targets for adult patients with these leukemias. Finally, our study identifies a set of genes that are differentially expressed between patients without molecular aberrations who have different outcomes. Validation studies on these transcripts are ongoing.
Grant support: NIH grant CA66996, Leukemia and Lymphoma Society Specialized Center for Research in Myeloid Leukemias, Associazione Italiana per la Ricerca sul Cancro, Ministero dell'Istruzione, Università e Ricerca, Fondo per gli Investimenti della Ricerca di Base project, Associazione per le Leucemie Acute dell'adulto “Cristina Bassi” e Fondazione Cassa di Risparmio di Genova e Imperia, and the Ted and Eileen Pasquarello Research Fund.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).