The human CD34+/CD38/Lin cell subset, comprising ∼1–10% of the CD34+ cell population, contains few of the less primitive hematopoietic (lineage-committed) progenitor cells (HPCs) but most of the primitive in vivo engrafting (lympho-)hematopoietic stem cells (HSCs). We analyzed gene expression in CD34+/CD38/Lin cell populations isolated from normal human adult donor bone marrow, neonatal placental/umbilical cord blood, and mobilized adult donor peripheral blood stem-progenitor cells. As measured by Affymetrix microarrays, 4746 genes were expressed in CD34+/CD38/Lin cells from all three tissues. We also determined the transcriptomes of the stem cell-depleted, HPC-enriched CD34+/[CD38/Lin]++ cell population from each tissue. Comparison of CD34+/CD38/Lin (HSC-enriched) versus CD34+/[CD38/Lin]++ (HPC-enriched, HSC-depleted) cells from each tissue yielded 81 genes overrepresented and 90 genes underrepresented, common to all three of the CD34+/CD38/Lin cell populations. These transcripts, which are selectively expressed in HSCs from all three tissues, include a number of known genes (e.g., transcription factors, receptors, and signaling molecules) that might play roles in key functions (e.g., survival, self-renewal, differentiation, and/or migration/adhesion) of human HSCs. Many genes/transcripts of unknown function were also detected by microarray analysis. Serial analysis of gene expression of the bone marrow HSC and HPC populations confirmed expression of most of the overrepresented transcripts for which reliable serial analysis of gene expression tags were detected and additionally suggested that current microarrays do not detect as many as 30% of the transcripts expressed in HSCs, including a number of previously unknown transcripts. This work is a step toward full definition of the transcriptome of normal human HSCs and may identify new genes involved in leukemogenesis and cancer stem cells.

A small number of in vivo engrafting (lympho-)hematopoietic stem cells (HSCs), present in bone marrow (BM), placental/umbilical cord blood (CB), or growth factor-mobilized peripheral blood stem-progenitor cells (PBSCs), give rise to progressively more lineage-committed hematopoietic progenitor cells (HPCs), which in turn produce all of the mature blood and immune cells and probably endothelial cells as well. In humans, most HSCs and HPCs express the CD34 phosphoglycoprotein and its mRNA. In vivo-engrafting HSCs comprise ≪1% of the total CD34+ cell population. The CD34+/CD38/Lin cell population is highly (∼100-fold) enriched in these in vivo-engrafting HSCs, as compared with the total CD34+ cell population. Other markers that enrich for primitive hematopoietic stem-progenitor cell (HSPC) subpopulations, such as CD133 or efficient efflux pumping of rhodamine or Hoechst dyes, have also been described, but with regard to HSC functions such as repopulation and engraftment ability they are much less extensively characterized for human than for mouse HSPCs (1, 2, 3, 4, 5).

A significant body of work has been reported on the gene expression of mouse HSPCs. For example, initial studies used cDNA/reverse transcription-PCR (RT-PCR)-based subtraction libraries of transcripts expressed in mouse fetal liver (6) or BM (7) HSPCs and found hundreds to thousands of transcripts overrepresented in HSPCs, as compared with more mature hematopoietic cells.

Park et al.(8), using a subtractive microarray approach to compare mouse HSC-enriched Thy1.1loc-kit+Sca-1hiLin-/lo cells to HPC-enriched populations, found that ∼5000 cDNA clones were differentially expressed between the two populations. Terskikh et al.(7) used nylon cDNA arrays, containing a limited set of 1176 genes, to examine gene expression of mouse HSCs, common myeloid, granulocyte-macrophage, megakaryocyte-erythrocyte, and lymphoid progenitors, and pro-B, and pro-T cells. Although this study examined only a handful of genes, the authors showed that a number of hematopoiesis-specific genes were expressed by HSCs. The expression of these genes decreased in progressively more committed HPCs, which at the same time, began to express lineage-specific genes. Akashi et al.(9) performed a similar study with 24,000 gene oligonucleotide arrays. In addition to confirming the prior study, they found that HSCs expressed a number of nonhematopoietic genes. However, because of the difficulties of isolating numbers of highly purified HSC-enriched subpopulations sufficient to produce the quantities of RNA needed for microarray hybridization, to date, only a handful of studies have attempted similar gene expression analyses with human HSPCs. Instead, most previous microarray analyses of human HSPCs have had to use relatively unpurified, total CD34+ cell preparations (only ≪1% of which are HSCs), rather than more highly HSC-enriched subpopulations of CD34+ cells. As an example, Steidl et al.(10) examined the expression of 1185 genes from BM and PBSC (total) CD34+ cells. They found 65 genes differentially expressed, some of which may explain the higher levels of cell cycling in CD34+ cells from BM, as compared with PBSCs. Although these studies defined genes expressed in the total CD34+ cell population, these analyses may have missed expression of key human HSC genes or misinterpreted their expression in HSCs versus more mature HPCs. 
In other words, these studies most likely identified genes expressed principally in HPCs, not HSCs. In addition, only relatively small-scale microarray gene expression analyses have been reported (generally <5,000–12,000 known genes), further limiting the impact of these studies of human HSPCs.

Two recent studies have begun to define a general gene expression phenotype for stem cells. Ramalho-Santos et al.(11) examined the transcriptomes of side population mouse BM Kit+LinSca-1+ HSC-enriched cells, mouse neurospheres, and a mouse embryonic stem cell line. Four transcripts were expressed in all three stem cell types but not in more mature cell types. An additional 212 transcripts were highly enriched in the three types of stem cells, but these genes were also detected in more mature cell types. Ivanova et al.(12) examined the transcriptomes of mouse adult BM Kit+LinSca-1+ Rholo, mouse fetal liver Kit+LinSca-1+ AA4.1+, and human fetal liver CD34+/CD38/Lin HSC-enriched cell populations, as well as mouse neurosphere side population cells and a mouse embryonic stem cell line. A total of 322 transcripts was enriched in all these HSPC populations and 283 transcripts in all three stem cell types. Interestingly, both these groups found that approximately half of the genes expressed in the stem cell-enriched populations had unknown function or were expressed sequence tags (ESTs). Yet, similar to previous work with HSPCs, these investigations studied mainly mouse cells, examining only one human cell population. In addition, comparison of the lists of stem cell-overexpressed genes from these two studies reveals only 6 genes common to both lists (13, 14, 15).

To further elucidate the gene expression and biology of human HSCs, we have focused on three clinically relevant tissue sources of adult human HSPCs. We isolated highly enriched CD34+/CD38/Lin and CD34+/[CD38/Lin]++ cell populations from normal human BM, CB, and PBSC preparations. CD34+/CD38/Lin cells from each of these tissues are capable of fully reconstituting lymphohematopoiesis by in vivo engraftment assays (2, 5, 16, 17, 18). In contrast, CD34+/[CD38/Lin]++ cells are known to be depleted of in vivo-engrafting HSCs and enriched in later HPCs. Therefore, we postulated that by comparing the gene expression profiles of the CD34+/CD38/Lin HSC-enriched population to those of the complementary CD34+/[CD38/Lin]++ HPC-enriched but HSC-depleted population from each tissue source (intersection analysis), we would identify a set of genes that might include candidate regulators involved in the survival, self-renewal, differentiation, and/or migration/adhesion capacities of human HSCs, as well as genes that may be targets in cancer stem cells, which give rise to blood cancers. Our principal gene expression analysis was carried out using the Affymetrix U133 chip set, containing 45,102 individual genetic targets (including a number of known genes/transcripts, predicted genes, and ESTs). We found 81 genes that were overrepresented and 90 genes underrepresented in the CD34+/CD38/Lin populations from all three tissues. To additionally confirm our comparisons and to possibly identify completely unknown transcripts and those missed by microarrays, we performed serial analysis of gene expression (SAGE; Ref. 19, 20, 21) on the BM HSC and HPC subpopulations. SAGE confirmed expression levels of 94% of the overrepresented transcripts. In addition, SAGE detected ∼58% more transcripts than the oligonucleotide microarrays, a large proportion of which were expressed only in the HSC-enriched population. 
Many of the tags detected by SAGE as overexpressed in HSC did not map to any known transcript or EST.

Isolation of CD34+/CD38/Lin and CD34+/[CD38/Lin]++ Cell Populations

Cryopreserved human CB CD34+ cells were purchased from AllCells (Berkeley, CA). Cryopreserved human cadaveric BM and PBSC CD34+ cells from normal adult donors were obtained from the National Heart, Lung, and Blood Institute Program of Excellence in Gene Therapy, Hematopoietic Cell Processing Core (Fred Hutchinson Cancer Center, Seattle, WA). Each BM sample was a pool of cells from 2 donors. One PBSC sample was a pool of 5 donors, the other a pool of 3 donors. The CB sample was a pool from >80 donors. Previous results in our laboratory have shown that an outlier in gene expression occurs at a frequency of <1 in 10–12 normal donors (unpublished data). Therefore, duplicate samples consisting of multiple donor pools were used to minimize the possibility that a rare outlier would affect the differential gene expression results. All human cells had been obtained with informed consent under Institutional Review Board-approved protocols and were provided without data identifying the donors. A total of 5 × 10^7 frozen total CD34+ cells were thawed, and viable cells were obtained by Ficoll-Hypaque density gradient centrifugation, resulting in 1.8–2.8 × 10^7 viable cells/sample.
Viable cells were then stained with phycoerythrin-conjugated antihuman CD34 monoclonal antibody and a mixture of FITC-conjugated monoclonal antibodies specific for human CD38 and the following lineage markers: CD3 (T-lymphoid cells); CD5 (T-lymphoid cells); CD10 (lymphoid progenitor cells); CD13 (mature and progenitor-precursor macrophage/monocytic and granulocytic cells); CD14 (monocyte/macrophages); CD16 (granulocytes, natural killer cells, and monocyte/macrophages); CD19 (mature and early B-lymphoid cells); CD33 (mature and progenitor-precursor macrophage/monocytic, and granulocytic cells); CD41a (mature and progenitor-precursor platelets, and megakaryocytic cells); CD45RA (B-lymphoid cells, some T-lymphoid cells, some mono/granulocytic progenitor-precursor cells); CD66B (granulocytic cells); CD71 (erythroid progenitor-precursor cells, activated lymphoid cells); and CD235a (glycophorin A; mature and precursor erythroid cells). All monoclonal antibodies were purchased from BD Biosciences-PharMingen (San Diego, CA), except CD13 (Dako, Glostrup, Denmark). Cells were isolated by fluorescence-activated cell sorting (FACS) using a FACSVantage flow cytometer (Becton Dickinson, Franklin Lakes, NJ).

Purification of Total RNA

After FACS, cells were pelleted by centrifugation at 800 × g in RNase-free, 1.5-ml siliconized microcentrifuge tubes (Ambion, Austin, TX). Pellets were disrupted by vigorous pipetting in 100 μl of Trizol Reagent (Invitrogen, Carlsbad, CA) per 10^6 cells. This solution was transferred to 1.5-ml PhaseLoc-Heavy tubes (Eppendorf, Hamburg, Germany), 20 μl of chloroform was added per 100 μl of Trizol, and the tubes were centrifuged at maximum speed (∼20,000 × g) in a microcentrifuge. The aqueous phase containing RNA was removed and additionally purified using the RNeasy Mini-Kit (Qiagen, Valencia, CA) following the manufacturer’s RNA Clean-up protocol with the optional On-column DNase Treatment; the only modification to the Qiagen protocols was that the number of washes was doubled for all washing steps.

Analysis of Gene Expression

Microarray Analysis of BM, CB, and PBSCs.

Five hundred ng of total RNA from each sample were double linear amplified with the ENZO BioArray High Yield RNA Transcript Labeling kit and the GeneChip Eukaryotic Small Sample Target Labeling Assay, Version II protocol (Affymetrix, Santa Clara, CA) to produce target for hybridization to Affymetrix U133 chips. Although 2× linear amplification of RNA is a commonly used and reliable method, we tested the fidelity of the method in preserving relative gene expression levels. RNA from total CD34+ PBSCs was compared with a reference RNA prepared from a control cell line. Five μg of each RNA were tested after standard 1× amplification, and 500 ng of each were tested after 2× amplification by hybridization to the U133A chip. Fold change comparisons of each condition were then performed with GeneSpring 5.0.2 software (Silicon Genetics, Redwood City, CA). Although there were minor changes in the absolute magnitude of change for a small number of genes, the directionality of change was different in <0.001% of the ∼4000 transcripts scored as “Present” (unpublished data).

BM and PBSC samples were tested in biological duplicate (i.e., samples from two different donor pools). The CB sample was tested in technical duplicate (i.e., same RNA donor pool analyzed twice). Initial quality assessments of duplicate samples were performed using Affymetrix MAS 5.0 software. In addition to the internal chip normalizations performed with Affymetrix chips, the U133 chips contain a set of 100 normalization genes (probe sets 200,000–200,099), which have been shown to be stably expressed across many different cell types; these normalization genes were used for additional normalization of all samples. GeneSpring 5.0.2 software was used for statistical analysis of differential transcript expression. In addition to the parametric statistical measures of gene expression provided by GeneSpring 5.0.2 and Affymetrix MAS 5.0, we used the nonparametric hypothesis-based analysis of microarrays method as a secondary filter applied to the experiment in the selection of overrepresented genes (Refs. 22, 23; see supplemental text for a full explanation of hypothesis-based analysis of microarrays). Filemaker Pro 6.0 software (Filemaker, Inc., Santa Clara, CA) was used to build a gene expression database, to compare gene expression patterns, and to classify genes by functional category. Gene/transcript annotation data were obtained by query of the Unigene, Locus Link, On-line Mendelian Inheritance in Man, and Kyoto Encyclopedia of Genes and Genomes molecular pathway information databases (24). Percent identity between cell populations was calculated by the formula: (genes shared by populations A and B [and C]) / (all genes expressed by population A or B [or C]).
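The percent identity formula above is the size of the intersection divided by the size of the union of the expressed-gene sets. A minimal sketch follows; the function name `percent_identity` and the toy gene identifiers are hypothetical and are not taken from the study's databases.

```python
def percent_identity(*populations):
    """Shared genes / all genes expressed by any population, as a percentage."""
    sets = [set(p) for p in populations]
    shared = set.intersection(*sets)   # genes expressed in every population
    union = set.union(*sets)           # genes expressed in at least one population
    if not union:
        raise ValueError("no genes supplied")
    return 100.0 * len(shared) / len(union)


# Toy gene sets (hypothetical identifiers, for illustration only)
bm = {"g1", "g2", "g3", "g4"}
cb = {"g2", "g3", "g4", "g5", "g6"}
pbsc = {"g3", "g4", "g6", "g7"}

print(f"{percent_identity(bm, cb):.1f}")        # 50.0
print(f"{percent_identity(bm, cb, pbsc):.1f}")  # 28.6
```

Applied to the counts reported in the Results (4746 transcripts shared by all three tissues out of 11,849 expressed in at least one), the same ratio gives 100 × 4746 / 11,849 ≈ 40.1%.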

SAGE.

Eight hundred ng of total RNA from the BM HSC-enriched and HPC-enriched populations were analyzed by Micro-SAGE. Micro-SAGE was carried out with the iSAGE kit (Invitrogen) and modified to follow the Micro-SAGE protocol (25). Sequencing of SAGE 10-mer tags of 2304 clones from each library was carried out by Agencourt Bioscience Corporation (Beverly, MA). SAGE tags were enumerated, annotated (with both the Reliable- and Full-SAGE tag mappings; see web site for a full description of these methods), and normalized with SAGE 2000 version 4.5 software (Invitrogen). Filemaker Pro 6.0 was used to build a gene expression database from the tag data. Transcripts with a SAGE tag count of 1 were excluded from analysis because erroneous tag sequences can be generated by sequencing errors at a rate of ∼1/500 tags. Because the odds of having two identical erroneous tags detected are ∼1/100,000 tags, we considered any gene expressed at ≥2 tags to be present by SAGE. There is no consensus statistical method (26, 27, 28, 29) for addressing significant differences of expression between SAGE libraries; we chose the method of Man et al.(26) to calculate P values for expression differences between the libraries.
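The presence rule described above (discard singleton tags, score a tag as present at two or more copies) can be sketched as follows. This is an illustration only: the function name `present_tags` and the example 10-mer sequences are hypothetical, and the sketch does not reproduce the SAGE 2000 normalization or the P value calculation of Man et al.

```python
from collections import Counter

MIN_TAGS = 2  # threshold stated in the text: >=2 tags counts as "present"


def present_tags(tag_sequences, min_tags=MIN_TAGS):
    """Count each tag and keep only those seen at least min_tags times."""
    counts = Counter(tag_sequences)
    return {tag: n for tag, n in counts.items() if n >= min_tags}


# Hypothetical 10-mer tags from one library
library = [
    "AAACCCGGGT", "AAACCCGGGT",
    "TTTGGGCCCA",                # singleton, treated as a possible sequencing error
    "CATGCATGCA", "CATGCATGCA", "CATGCATGCA",
]

print(present_tags(library))
```

The singleton tag is dropped, while the tags seen two and three times are retained with their counts.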

RNA sequences for differentially expressed transcripts were downloaded from GenBank. Multiple PCR primers for each transcript were designed with Primer 3.0 (Whitehead Institute, Massachusetts Institute of Technology, Boston, MA) and tested against a 2-fold dilution series of test sample prepared by mixing cDNA from unsorted CD34+ cells from BM, CB, and PBSCs. We had previously determined that β-actin is an optimal normalization gene for calibration of quantitative RT-PCR (qRT-PCR) results among different CD34+ cell populations (unpublished results). Two-step RT-PCR was carried out by first producing cDNA with a modified version of the Super-SMART PCR cDNA kit (Clontech, Palo Alto, CA). Second, qRT-PCR was carried out on a Bio-Rad iCycler (Bio-Rad, Hercules, CA) with iQ SYBR-green Supermix (Bio-Rad). Only primer sets that produced a single product band (as shown by both agarose gel and melt-curve analysis) and that resulted in doubling efficiencies of ∼100% were used for additional analyses. This was imperative because the 2^(-ΔΔCt) method (30), which assumes that the amount of product doubles with each cycle, was used to calculate fold difference in gene expression.
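As an illustration of the fold-difference calculation, the sketch below applies the standard 2^(-ΔΔCt) formula with β-actin as the normalization gene and the HPC-enriched population as the comparator. The function name and the Ct values are hypothetical; the formula assumes ∼100% amplification efficiency, which is why primer sets were screened for it.

```python
def fold_difference(ct_target_hsc, ct_actin_hsc, ct_target_hpc, ct_actin_hpc):
    """Relative expression (HSC vs. HPC) by the 2^(-ddCt) method.

    Assumes the product doubles every cycle (about 100% efficiency).
    """
    delta_hsc = ct_target_hsc - ct_actin_hsc   # dCt in the HSC-enriched sample
    delta_hpc = ct_target_hpc - ct_actin_hpc   # dCt in the HPC-enriched sample
    ddct = delta_hsc - delta_hpc               # ddCt
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier in HSCs
# after normalization to beta-actin, i.e. an 8-fold overrepresentation.
print(fold_difference(24.0, 18.0, 27.0, 18.0))  # 8.0
```

A value above 1 indicates overrepresentation in the HSC-enriched population, and a value below 1 indicates underrepresentation.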

CD34+/CD38/Lin and CD34+/[CD38/Lin]++ Cell Isolation.

A total of 1.8–2.8 × 10^7 viable CD34+ cells/sample was FACS sorted. The average RNA content (∼1.5 pg/cell) of both CD34+/CD38/Lin and CD34+/[CD38/Lin]++ cells dictated a requirement for ∼10^6 FACS-sorted cells/subpopulation to yield sufficient RNA for transcriptome analysis. Therefore, for these experiments, the 5–10% of cells with the lowest and the highest intensity of FITC fluorescence (corresponding to expression of the CD38/Lin marker mixture) were sorted by FACS as the CD34+/CD38/Lin (HSC-enriched) and CD34+/[CD38/Lin]++ (HPC-enriched, HSC-depleted) cell preparations, respectively. This resulted in 8% of the cells from CB (a single FACS sort), 8.5% from BM (average of two sorts), and 9% from PBSCs (average of two sorts) being isolated as the CD34+/CD38/Lin and CD34+/[CD38/Lin]++ cell populations. CB cells yielded 2 μg of RNA for the CD34+/CD38/Lin cells and 2.3 μg for the CD34+/[CD38/Lin]++ cells; the corresponding yields were 1.6 and 1.6 μg for BM (average of two samples) and 1.5 and 1.1 μg for PBSCs (average of two samples). FACS reanalyses of the starting CD34+ cells and the FACS-sorted cells (shown for one of the FACS sorts for each tissue in Fig. S1) demonstrated that the purified cell populations were highly enriched for the specified phenotypes.
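The cell-number requirement follows from simple arithmetic on the figures given above. The sketch below is an idealized calculation that assumes complete recovery of cells and RNA, which real sorts and extractions will not achieve; the function names are hypothetical.

```python
PG_PER_CELL = 1.5  # average RNA content stated in the text (pg/cell)


def expected_rna_ug(n_cells, pg_per_cell=PG_PER_CELL):
    """Idealized RNA yield in micrograms, assuming no losses."""
    return n_cells * pg_per_cell / 1e6  # 1 microgram = 1e6 picograms


def sorted_cells(viable_cells, fraction):
    """Number of cells in a sort gate covering the given fraction of input."""
    return viable_cells * fraction


# About 10^6 sorted cells correspond to about 1.5 micrograms of total RNA.
print(expected_rna_ug(1_000_000))

# An 8% gate applied to the lower bound of 1.8 x 10^7 viable cells.
print(round(sorted_cells(1.8e7, 0.08)))
```

Under these assumptions, gates of 8–9% applied to 1.8–2.8 × 10^7 viable cells contain on the order of 10^6 cells, consistent with the 1.1–2.3 μg yields reported.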

The Transcriptome of CD34+/CD38/Lin Cells by Oligonucleotide Microarray Analysis.

The oligonucleotide microarray gene expression results for each of the three tissues were filtered with MAS 5.0 software to select only those genes scored as “Present” in the CD34+/CD38/Lin populations. A total of 11,849 transcripts was expressed by at least one of three HSC populations. A total of 6,366 transcripts was detected in the CD34+/CD38/Lin population from BM, 11,075 from CB, and 6,669 from PBSCs (Fig. 1A). A total of 4746 of these genes was expressed in the CD34+/CD38/Lin population of all three tissues; this group included 2943 transcripts of known function, 1310 uncharacterized transcripts or ESTs, and 493 predicted transcripts (Fig. 1B). Supplemental Database S2 lists all transcripts detected for each tissue and those expressed in all three tissues. At the global gene expression level, the BM and CB populations share 50.4% identity, CB and PBSCs share 54.9% identity, and BM and PBSCs share 59.7% identity. Overall, the three populations share 40.1% identity at the level of transcriptome phenotype.

Microarray Analysis of the HSC-Enriched (CD34+/CD38/Lin) Transcriptome Compared with the HPC-Enriched CD34+/[CD38/Lin]++ Transcriptome.

For each of the three tissues, GeneSpring 5.0.2 software was used to generate lists of transcripts that were ≥2-fold differentially expressed, and significantly different at the 90% confidence level by Student’s t test, in the CD34+/CD38/Lin HSC-enriched cell population as compared with the CD34+/[CD38/Lin]++ HPC-enriched cell population from the same tissue (Figs. 2A and 3A). The CD34+/CD38/Lin population from BM overexpressed 1190 transcripts and underexpressed 1159 transcripts, from CB overexpressed 889 and underexpressed 939 transcripts, and from PBSCs overexpressed 506 and underexpressed 519 transcripts. Intersecting these results for all three tissues yielded 87 Affymetrix probe sets (representing 81 genes) comparatively overrepresented (Table 1) and 95 Affymetrix probe sets (representing 90 genes) underrepresented (Table S4) in the CD34+/CD38/Lin HSC-enriched compared with the CD34+/[CD38/Lin]++ HPC-enriched population. These genes were also independently selected by the nonparametric, hypothesis-based analysis of microarrays method (Database S6). Functional annotation of the HSC-overrepresented genes (Fig. 2B) yielded 50 genes of known/predicted function and 30 genes of unknown function (including 12 ESTs and 7 predicted proteins). Annotation of the HSC-underrepresented genes yielded 59 genes of known function and 31 genes of unknown function including 15 ESTs and 8 predicted proteins (Fig. 3B).
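The intersection step can be sketched as a per-tissue filter followed by a set intersection. This is an illustration under stated assumptions, not the GeneSpring workflow: the probe names, fold values (HSC/HPC ratio), and P values are hypothetical, and the 90% confidence level is interpreted here as P ≤ 0.10.

```python
def overrepresented(results, min_fold=2.0, max_p=0.10):
    """Probes at least min_fold higher in HSC than HPC and passing the P cutoff."""
    return {probe for probe, (fold, p) in results.items()
            if fold >= min_fold and p <= max_p}


def underrepresented(results, min_fold=2.0, max_p=0.10):
    """Probes at least min_fold lower in HSC than HPC and passing the P cutoff."""
    return {probe for probe, (fold, p) in results.items()
            if fold <= 1.0 / min_fold and p <= max_p}


def intersect_tissues(per_tissue, selector):
    """Keep only probes selected in every tissue."""
    return set.intersection(*(selector(r) for r in per_tissue.values()))


# Hypothetical results: probe -> (HSC/HPC fold ratio, P value)
per_tissue = {
    "BM":   {"probe_a": (4.0, 0.01), "probe_b": (3.0, 0.02),
             "probe_c": (0.4, 0.03), "probe_d": (2.5, 0.30)},
    "CB":   {"probe_a": (2.2, 0.05), "probe_b": (1.5, 0.01),
             "probe_c": (0.3, 0.04), "probe_d": (2.1, 0.02)},
    "PBSC": {"probe_a": (6.0, 0.08), "probe_b": (2.4, 0.03),
             "probe_c": (0.5, 0.09), "probe_d": (3.0, 0.01)},
}

print(intersect_tissues(per_tissue, overrepresented))   # {'probe_a'}
print(intersect_tissues(per_tissue, underrepresented))  # {'probe_c'}
```

In this toy example, probe_b and probe_d each pass in two tissues but fail in a third, so only probe_a survives the intersection. The same effect reduces the 506–1190 overexpressed transcripts per tissue to the 87 shared probe sets reported above.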

SAGE of BM HSC-Enriched and HPC-Enriched Populations.

The BM populations showed the greatest differences in gene expression between the HSC- and HPC-enriched populations. Therefore, BM was chosen for SAGE. A total of 84,107 tags was detected from the HSC population library and 87,416 tags from the HPC population library. Herein, we focused on only the genes identified as HSC-overexpressed genes by the microarray analyses. SAGE produced tags for 65 of the 81 transcripts that were overexpressed in HSCs by microarray analysis (Table 1). SAGE confirmed HSC overexpression of 61 (94%) of these 65 genes. For 4 (6%) transcripts, SAGE showed similar expression in the HSC versus HPC population. SAGE did not detect nonredundant tags for 16 (20%) of the 81 transcripts (Table 1), making it impossible to determine expression of these transcripts by SAGE.

Overall, SAGE identified 10,078 transcripts expressed by BM HSC-enriched cells (Database S7a), ∼58% more transcripts than the 6366 detected by microarray analysis. In addition, 2916 transcripts were overexpressed at least 2-fold in the BM HSC-enriched population by SAGE (Database S7b), compared with the 1190 transcripts identified as HSC overexpressed by microarray analysis. Of these HSC-overexpressed transcripts identified by SAGE, 2008 were detected exclusively in the HSC population (i.e., they were completely absent in the HPC-enriched population). A total of 646 tags detected by SAGE as expressed in HSCs (Database S7c) did not map to any known transcript or EST; of these, 408 tags were overexpressed in HSCs, and 238 of these 408 tags were detected exclusively in the HSC-enriched population (i.e., not detected in HPCs).

Confirmation of Gene Expression Results by qRT-PCR.

Twenty-nine genes were chosen from the list of microarray HSC-overrepresented genes (Table 1) and 19 genes from the list of HSC-underrepresented genes (Table S4) for confirmation of fold difference by relative qRT-PCR. Transcripts were chosen to cover the entire observed range of fold differences from 2-fold to the maximum of 60-fold. Expression levels of these 48 transcripts were tested in HSC- and HPC-enriched populations from all three tissues for a total of 144 independent qRT-PCR tests. A total of 141 of these 144 qRT-PCR assays confirmed the observed differential expression in the CD34+/CD38/Lin HSC-enriched compared with the CD34+/[CD38/Lin]++ HPC-enriched cell population; there were only 3 transcripts for which differential expression by microarray was not confirmed by qRT-PCR for all three tissues (Figs. S2A and S2B). In each of these 3 cases, the analyses disagreed in only one of the three tissues (and even in this one tissue, there was a difference in gene expression, but it did not meet the arbitrary 2-fold cutoff). Therefore, an exceptional level of 98% qRT-PCR confirmation was achieved for microarray results in this study. Indeed, the magnitude of fold difference detected by qRT-PCR tended to be greater than that found by the microarrays for several genes (e.g., CRFBP, LAGY, EDM, and HTM4), most likely due to greater sensitivity of PCR, and agreed very closely for most others (e.g., CD52, HERMES, HLF, and FKSG14).

To date, gene expression studies of human HSPCs have either focused on the total CD34+ cell population or compared purified subsets of CD34+ cells from only a single tissue. Although these studies have added to the knowledge base concerning HSPCs, HSCs are only a tiny subset of the total CD34+ cell population. Therefore, analyses of total CD34+ cells might not detect genes expressed selectively by HSCs, especially genes expressed at relatively low levels, but would detect mostly genes expressed by committed progenitor cells. For example, in a recent investigation that analyzed the total CD34+ cell population by SAGE (31), myeloperoxidase was one of the genes found to be expressed in total CD34+ cells. However, myeloperoxidase is expressed only in committed phagocytic precursors and phagocytes, not in undifferentiated HSCs (32, 33, 34).

Two problems of analyzing subpopulations of CD34+ cells from a single tissue quickly become evident. First, comparisons of cells within a given tissue will probably identify not only genes important for HSC functions but also a large number of genes expressed due to the general physiology of the HSPCs residing within that particular tissue. Second, although CD34+/CD38/Lin cells isolated from BM, CB, or PBSCs contain HSCs capable of fully reconstituting hematopoiesis, they are still a heterogeneous population of cells (1, 2, 3, 4, 5), which also contains some very early HPCs. Therefore, we postulated that comparing the gene expression profiles of the purified CD34+/CD38/Lin cell population to that of the CD34+/[CD38/Lin]++ population from each of these three tissues and then determining the genes identified as differentially expressed by the HSC in all three tissues (intersection analysis) would allow us to focus more clearly on genes likely to be involved in HSC versus HPC function; because all three tissue populations contain HSCs, which engraft after bone marrow transplantation, those transcripts differentially expressed in the HSC-enriched populations from all three tissues should include all transcripts vital to HSCs, whereas those genes expressed only in one tissue type, as well as those due to differences in the heterogeneous makeup of the CD34+/CD38/Lin population, would tend to be filtered out.

Intersection analysis identified 4746 transcripts expressed by the HSC-enriched populations from all three tissues (Fig. 1A). These genes encoded transcription factors, signaling/receptor proteins, and other molecules with known functions. In concordance with the observations of other stem cell studies (6, 7, 8, 11, 12, 31), a large proportion of the HSC-expressed genes had unknown function, were ESTs, or encoded hypothetical proteins. Our list of genes expressed in CD34+/CD38/Lin cells (Supplemental Database S2) includes a number of genes previously shown to be involved in hematopoiesis (e.g., KIT, FLT3, GATA-2, GATA-3, p27, HoxA5, and HoxA9), as well as markers for HSCs (e.g., CD34, MDR2). Many genes known (or expected) to be expressed only by HPCs or more mature blood or immune cells (e.g., myeloperoxidase, CD38) are not present in this HSC list but are detected in the HPC population (Supplemental Database S4). These observations indicate stringent purity of the HSC and HPC populations that we examined, as also suggested by the flow cytometric reanalysis of the purified cell populations (Fig. S1). Genes expressed by only one population, and many of those expressed by two populations, should fall within the following categories: (a) genes expressed due to tissue-specific microenvironment; (b) genes differentially expressed because of different proportions of HSCs to non-HSCs (i.e., very early progenitor cells) within the CD34+/CD38/Lin population; or (c) genes falsely scored positive by the Affymetrix chip system. Intersection analysis is designed to exclude all of these conditions. We generated lists of genes that were differentially expressed (with a statistically significant 2-fold change) in the microarray analyses of the CD34+/CD38/Lin versus the CD34+/[CD38/Lin]++ population (Supplemental Databases S2 and S3). Approximately 2200 genes were differentially overexpressed by any one of the HSC populations. 
In contrast to these large numbers of differentially expressed genes in any single tissue, only 81 genes were overrepresented (Fig. 2A, Table 1), and 90 genes were underrepresented in the intersection (Fig. 3A, Table S4) of HSC-enriched populations. The qRT-PCR and SAGE results provide extremely high confirmation rates, indicating that the intersection analysis was highly selective for identifying actual differentially expressed genes.

The HSC population overexpressed a number of known genes that may be involved in the seminal characteristics of the stem cell. A handful of examples are included: Kruppel-like factors 2 and 4 are thought to be regulators of cellular quiescence, maintenance, and cell cycle arrest (35). CEBPB has been shown to control the expression of a number of cytokines in immune cells (36) and is involved in cell survival and tumorigenesis associated with the RAS oncogene (37). The recently annotated human immune-associated nucleotide 2 protein is a putative control protein of GDP/GTP-signaling proteins (38) and may also play a role in self-renewal by limiting the effects of growth factor-directed differentiation. We found two HOX genes (39, 40) overexpressed. HoxA3 is involved in formation of the nervous system (41, 42), pharyngeal glandular organs (43), and thymic epithelial cells (44) but has not been studied in hematopoiesis. HoxB6 is expressed in HSPCs (45, 46, 47), is involved in differentiation of the granulocytic lineage (48), and may suppress development of erythroid progenitors (49).

In addition to confirming the microarray results, SAGE results revealed three additional interesting findings. First, SAGE detected ∼30% more genes expressed by the HSC-enriched population than were detected by microarray, most likely because of low copy number or high probe set background (the latter would cause the MAS 5.0 software to make an “Absent” call for that particular transcript). We scored a transcript tag as “Present” only if it occurred at a frequency of two tags or greater. Although unlikely, it is possible that a small number of transcripts are false positives because of sequencing errors during tag detection. In addition, it is possible that a small percentage of the detected tags identify splice variants of the same gene. A large proportion of the transcripts identified by SAGE were expressed exclusively within the HSC population, many times more than were exclusively expressed within the HPC-enriched population. This tends to confirm the observation of Terskikh et al.(7) and Akashi et al.(9) who showed that hematopoietic genes expressed by mouse HSCs diminish during differentiation to early and late HPCs, which begin to express lineage-specific genes. Our data with human populations tend to confirm this finding for the equivalent human genes, e.g., HoxA5, HoxA9, Bmi-1, RER, Tyk2, JAM1, API-1, and API-2, although a number of these genes were not differentially expressed (at >2-fold between the HSC and HPC populations) in all three tissues. Also, a current theory to explain the multipotent and possible trans-differentiation potential of stem cells is that they exist in an open epigenetic state; this would allow the stem cell to develop toward any lineage by transcriptional up-regulation of a lineage-specific set of genes without chromatin remodeling. Gene silencing would occur in maturing cells, resulting in a more restricted transcriptome. 
Akashi et al. (9) suggest that HSCs have an open chromatin structure because they appear to weakly express a number of genes normally associated with nonhematopoietic cell types. Our overall expression data (Fig. 1 A, Supplemental Database S1) support this theory because a number of nonhematopoietic genes are detected, e.g., neuronal-associated genes ANA/BTG3, GIF/TIEG, and SMN1; endothelial-associated genes ANG-1 and PROCR/EPCR; liver-associated genes CYP2C38, CPT1, and aldo-keto reductase 1; and muscle-associated genes MEF2 and NRAP. Furthermore, fetal CB HSCs (hypothesized to be a more primitive population than adult BM or PBSC HSCs) expressed many more genes than adult BM or PBSC HSCs. This considerable number of additional transcripts beyond those identified by the microarrays may be involved in HSC biology. Finally, we found 646 tags expressed by the HSC-enriched population that did not correspond to any known gene or EST. This suggests that cells within the HSC population express a large number of completely novel transcripts, amounting to ∼6% of all of the transcripts that they expressed. One caveat to these numbers is that some of the unidentified tags may identify the same transcript, although the number of transcripts with multiple tags would be expected to comprise only a small percentage of the tags detected.
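
The tag-scoring rule described above (a tag is scored “Present” only at a frequency of two tags or greater) and the comparison of HSC-exclusive versus HPC-exclusive tags can be illustrated with a minimal Python sketch. The tag sequences and counts below are invented for illustration and are not data from this study. The function name `present_tags` is hypothetical, and treating a tag as exclusive to one library whenever it fails the Present threshold in the other library is an assumption of this sketch, not necessarily the rule used in the analysis.

```python
from collections import Counter

MIN_TAG_COUNT = 2  # threshold stated in the text: two tags or greater


def present_tags(tag_counts, min_count=MIN_TAG_COUNT):
    """Return the set of tags scored as Present in one SAGE library."""
    return {tag for tag, n in tag_counts.items() if n >= min_count}


# Hypothetical tag counts (sequences invented for illustration only).
hsc_library = Counter(
    {"AAACCCGGGT": 5, "TTTGGGCCCA": 2, "GATTACAGAT": 1, "CCCCAAAATT": 3}
)
hpc_library = Counter({"AAACCCGGGT": 4, "GATTACAGAT": 6, "CCCCAAAATT": 1})

hsc_present = present_tags(hsc_library)
hpc_present = present_tags(hpc_library)

# Tags Present in only one population, and tags Present in both.
hsc_only = hsc_present - hpc_present
hpc_only = hpc_present - hsc_present
shared = hsc_present & hpc_present

print("HSC only:", sorted(hsc_only))
print("HPC only:", sorted(hpc_only))
print("Shared:", sorted(shared))
```

A stricter definition of exclusivity would require zero tags in the other library; under that rule the tag with a single HPC count in this toy example would no longer be scored as HSC exclusive.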

Genes found to be differentially overexpressed by independent laboratories should be the highest-priority candidates to evaluate further for key roles in HSC biology. Readers may use the full databases (supplementary data) to perform their own meta-analyses, but to illustrate, we performed a limited meta-analysis of microarray results (50, 51). We compared the list of 81 genes overrepresented in our human CD34+/CD38/Lin cells with the reported findings for HSC-enriched populations in two recent studies that examined the transcriptomes of several types of stem cells, including mouse BM Kit+LinSca-1+ SP HSPCs, human CD34+CD38Lin fetal-liver HSPCs, mouse Kit+LinSca-1+AA4.1+ fetal-liver HSPCs, and mouse Kit+LinSca-1+Rhodaminelo BM HSPCs (11, 12). Only the transcription factor GATA3 was overrepresented in all four datasets. Three transcription factors (HLF, MDS1, and CEBPB), one RNA-processing protein (RBPMS/HERMES), and one cell surface receptor (MPL/CD110) were found in our own results plus two of the other datasets (Table 2).
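
The overlap counts in this limited meta-analysis can be reproduced from the YES/no/ND calls in Table 2. The following Python sketch is only an illustration: the dictionary is transcribed from Table 2, the names `TABLE2_CALLS` and `n_supporting` are hypothetical, and counting an “ND” call as nonsupporting is an assumption of the sketch.

```python
# Calls transcribed from Table 2, in the order
# (Santos mouse, Ivanova human, Ivanova mouse).
TABLE2_CALLS = {
    "HLF": ("no", "YES", "YES"),
    "HERMES": ("YES", "YES", "ND"),
    "CD110": ("YES", "no", "YES"),
    "ROBO4": ("no", "ND", "YES"),
    "HOXB6": ("no", "YES", "no"),
    "GATA3": ("YES", "YES", "YES"),
    "SOCS-2": ("YES", "ND", "no"),
    "SPTBN1": ("YES", "no", "no"),
    "MDS1": ("no", "YES", "YES"),
    "KLF4": ("no", "ND", "YES"),
    "TRAIL": ("no", "YES", "ND"),
    "GBP2": ("YES", "ND", "ND"),
    "DKFZP434J214": ("YES", "ND", "ND"),
    "CEBPB": ("YES", "YES", "no"),
}


def n_supporting(calls):
    """Count the external datasets in which a gene was overrepresented.

    Assumption: "ND" (not determined) is counted as nonsupporting.
    """
    return sum(call == "YES" for call in calls)


support = {gene: n_supporting(calls) for gene, calls in TABLE2_CALLS.items()}

# Every gene listed is already overrepresented in the present study, so
# three external calls correspond to "all four datasets".
in_all_four = sorted(g for g, n in support.items() if n == 3)
in_ours_plus_two = sorted(g for g, n in support.items() if n == 2)

print("All four datasets:", in_all_four)
print("Our results plus two others:", in_ours_plus_two)
```

Run on the Table 2 calls, the sketch returns GATA3 alone for all four datasets and HLF, MDS1, CEBPB, HERMES, and CD110 for our results plus two of the other datasets, in agreement with the text.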

Recently, it has been proposed that cancer is a stem cell disease (2, 52, 53, 54, 55, 56, 57). Most cancers may arise from self-renewing (stem) cells. Alternatively, cancer cells may mutationally gain certain characteristics of stem cells, particularly the ability to self-renew. A number of the genes identified in this study have already been implicated in hematological malignancies; CD110/MPL is a good example. Overexpression of CD110 has been demonstrated to immortalize HSPCs. Finally, a number of studies have shown that leukemias arise from cells with HSC characteristics (2, 52, 53, 57). Presumably, some of the other overexpressed genes, including both the known and the newly identified genes, may be involved in carcinogenesis, especially leukemogenesis. A number of studies have shown that at least some solid cancers are stem cell diseases. Hemmati et al. (55) found a subpopulation of brain tumor cells that resemble neural stem cells and appear to self-renew. Al-Hajj et al. (56) describe similar findings in breast tumors, in that a protein expression-defined subset of tumor stem cells were the only cells able to reconstitute the tumor. Thus, identification of the full spectrum of genes involved in the biology of the HSCs is critically important for the study of leukemia and likely other cancers. Our rigorous examination of the transcriptomes of HSCs from all three of the major hematopoietic tissue sources should lead to identification of novel target genes involved in the development of hematopoietic and other malignancies.

Grant support: Children’s Cancer Foundation.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Note: Supplemental materials containing all of the data cited in this study, as well as all of the confirmatory studies, can be downloaded at http://203.200.58.139/ftp. The Johns Hopkins University holds patents on CD34 monoclonal antibodies and related inventions. Dr. Civin is entitled to a share of the sales royalty received by the University under licensing agreements between the University, Becton Dickinson Corporation, and Baxter HealthCare Corporation. The terms of these arrangements have been reviewed and approved by the University in accordance with its conflict of interest policies.

Requests for reprints: Curt I. Civin, BBCRB 2M44, Kimmel Cancer Center at Johns Hopkins, 1650 Orleans Street, Baltimore, MD 21231. Phone: (410) 955-8816; Fax: (410) 955-8897; E-mail: [email protected]

⁶ Internet address: http://www.ncbi.nlm.nih.gov/UniGene/.

⁷ Internet address: http://www.ncbi.nlm.nih.gov/LocusLink/.

⁸ Internet address: http://www.ncbi.nlm.nih.gov/omim/.

⁹ Internet address: http://www.genome.ad.jp/kegg/kegg2.html.

¹⁰ Internet address: http://www.sagenet.org.

¹¹ Internet address: http://www.ncbi.nlm.nih.gov/GenBank/.

¹² Internet address: http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi.

¹³ Internet address: http://www.geneontology.org/.

Fig. 1.

All transcripts expressed in bone marrow (BM), cord blood (CB), and/or peripheral blood stem-progenitor cell (PBSC) CD34+/CD38/Lin populations. Gene expression results from the U133 A and B chips were analyzed with Affymetrix MAS 5.0 software. Only transcripts scored as “Present” (i.e., detectably expressed) in CD34+/CD38/Lin cells from both duplicate samples for each tissue source were included. The Venn diagram (A) depicts the numbers of genes expressed in one, two, and/or all three tissues. Lists of genes with levels of their tissue expression are provided in separate worksheets in Table S2 (Microsoft Excel spreadsheets): shown are transcripts (A) expressed in all three tissues; transcripts expressed in (B) BM, (C) CB, or (D) PBSCs; (E) transcripts expressed in BM and CB, (F) in BM and PBSCs, or (G) in CB and PBSCs. The functional categorization, based on the Gene Ontology Consortium¹³ classification system, of the 4746 common transcripts is shown in Fig. 2 B.
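
The set logic behind the Venn diagram (Present in both duplicates, then partition by tissue) can be sketched in a few lines of Python. This is an illustration only: the gene identifiers `g1` to `g9` are invented, the function names are hypothetical, and reading worksheets B, C, and D as transcripts found in one tissue only is an assumption of the sketch.

```python
def present_in_both_replicates(rep1, rep2):
    """Keep only transcripts called Present in both duplicate samples."""
    return set(rep1) & set(rep2)


def venn_regions(bm, cb, pbsc):
    """Split three transcript sets into the seven regions of a Venn diagram."""
    return {
        "A_all_three": bm & cb & pbsc,
        "B_BM_only": bm - cb - pbsc,
        "C_CB_only": cb - bm - pbsc,
        "D_PBSC_only": pbsc - bm - cb,
        "E_BM_and_CB": (bm & cb) - pbsc,
        "F_BM_and_PBSC": (bm & pbsc) - cb,
        "G_CB_and_PBSC": (cb & pbsc) - bm,
    }


# Hypothetical Present calls for duplicate samples of each tissue.
bm = present_in_both_replicates(
    {"g1", "g2", "g3", "g4"}, {"g1", "g2", "g3", "g4", "g9"}
)
cb = present_in_both_replicates(
    {"g1", "g2", "g5", "g6"}, {"g1", "g2", "g5", "g6"}
)
pbsc = present_in_both_replicates(
    {"g1", "g3", "g5", "g7"}, {"g1", "g3", "g5", "g7", "g8"}
)

regions = venn_regions(bm, cb, pbsc)
for name, genes in regions.items():
    print(name, sorted(genes))
```

Because the seven regions are disjoint, their sizes sum to the size of the union of the three sets, which is a convenient check on any such partition.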

Fig. 2.

Genes overrepresented in the CD34+/CD38/Lin cell population compared with the CD34+/[CD38/Lin]++ population from bone marrow (BM), cord blood (CB), and/or peripheral blood stem-progenitor cells (PBSCs). Results from the U133 A and B chips were subjected to statistical analysis with GeneSpring 5.0.2 to generate P values for the CD34+/CD38/Lin versus the CD34+/[CD38/Lin]++ populations from each tissue. Only genes meeting the 90% confidence level for fold difference in transcript expression and ≥2-fold overrepresented in the CD34+/CD38/Lin population are included. The Venn diagram depicts the numbers of genes overrepresented in the CD34+/CD38/Lin population from one, two, and/or all three tissues. Genes overrepresented in the CD34+/CD38/Lin preparations from (A) all three tissues are listed in Table 1. The following lists of genes and associated expression values are provided in separate worksheets within Table S3 (Microsoft Excel): genes overrepresented in the CD34+/CD38/Lin populations from (B) BM, (C) CB, or (D) PBSCs; from (E) BM and CB; from (F) BM and PBSCs; or from (G) CB and PBSCs. The functional categorization of the 81 genes overexpressed in the CD34+/CD38/Lin populations from all three tissues, based on the Gene Ontology Consortium classification system, is shown in B.

Fig. 3.

Genes underrepresented in the CD34+/CD38/Lin cell population compared with the CD34+/[CD38/Lin]++ population from bone marrow (BM), cord blood (CB), and/or peripheral blood stem-progenitor cells (PBSCs). Results were analyzed as in Fig. 2. The Venn diagram depicts the numbers of genes underrepresented in the CD34+/CD38/Lin population from one, two, and/or all three tissues. Genes underrepresented in the CD34+/CD38/Lin population from (A) all three tissues are listed in Table S4. The following lists of genes and associated expression values are provided in separate worksheets within Table S3 (Microsoft Excel): genes underrepresented in the CD34+/CD38/Lin populations from (B) BM, (C) CB, or (D) PBSCs; from (E) BM and CB; from (F) BM and PBSCs; or from (G) CB and PBSCs. The functional categorization of the 90 genes underrepresented in the CD34+/CD38/Lin populations from all three tissues, based on the Gene Ontology Consortium classification system, is shown in B.

Table 1

Genes overrepresented in the CD34+/CD38/Lin population from all three tissues (BM, CB, PBSC)ᵃ

| Common name(s)ᵇ | BM SAGEᶜ (fold change) | BM (fold change) | CB (fold change) | PBSC (fold change) | UniGeneᵈ | Known/(Probable) function |
| --- | --- | --- | --- | --- | --- | --- |
| AD036 mRNA | ND | 3.49 | 2.29 | 4.19 | (AF260333.1) | Unknown |
| ARG2 | 3.0 | 3.73 | 2.46 | 2.01 | Hs.172851 | Nitric oxide and polyamine metabolism |
| BIRC3 | 2.0 | 2.39 | 3.45 | 2.17 | Hs.127799 | Inhibitor of apoptosis |
| BST2 | 2.3 | 8.88 | 3.92 | 3.36 | Hs.118110 | (Growth and development of B-cell) |
| CD37 | 2.3 | 5.10 | 2.64 | 2.42 | Hs.153053 | (Signal transduction, T-cell-B-cell interactions) |
| CD52* | 1.0 | 25.10 | 4.28 | 2.88 | Hs.276770 | Unknown |
| cDNA DKFZp434C1915* | ND | 10.56 | 3.32 | 4.85 | Hs.46531 | Unknown |
| cDNA DKFZp434G012* | ND | 13.65 | 2.33 | 3.38 | Hs.303154 | Unknown |
| cDNA DKFZp564E227* | ND | 4.29 | 2.06 | 2.83 | (AL136693.1) | Unknown |
| cDNA DKFZp564F053* | 3.5 | 4.29 | 2.01 | 2.05 | Hs.71968 | Unknown |
| cDNA DKFZp586J0323* | HSC | 7.14 | 4.83 | 2.73 | Hs.102301 | Unknown |
| cDNA FLJ14054* | 3.0 | 3.30 | 4.78 | 2.31 | Hs.13528 | Unknown |
| cDNA FLJ20378* | HSC | 3.55 | 1.76 | 2.03 | Hs.136252 | Unknown |
| cDNA FLJ21472, KIAA1939* | ND | 4.41 | 2.31 | 2.17 | Hs.182738 | Unknown |
| cDNA FLJ22690* | HSC | 11.68 | 2.09 | 2.71 | Hs.105468 | Unknown |
| cDNA FLJ40058* | ND | 2.84 | 1.70 | 2.72 | Hs.376041 | Unknown |
| CEBPB | HSC | 2.25 | 2.05 | 2.38 | Hs.99029 | Transcription factor with bZIP-domain |
| CIS2, SOCS-2* | 2.4 | 7.55 | 3.14 | 2.35 | Hs.351744 | (Regulation of insulin-like growth factor 1) |
| CLECSF2* | 2.3 | 2.53 | 2.16 | 2.07 | Hs.85201 | Unknown |
| COX6B | 4.5 | 8.28 | 2.37 | 2.63 | Hs.174031 | Subunit VIb of cytochrome c oxidase |
| CRFBP, CRF-BP* | 12.0 | 34.62 | 9.85 | 3.96 | Hs.115617 | Inhibits CRH in plasma |
| cDNA DKFZP434J214* | 2.7 | 4.16 | 2.10 | 2.11 | Hs.12813 | Unknown (role in telomere maintenance) |
| ECM | ND | 5.77 | 2.10 | 3.54 | Hs.268107 | Actor V/Va-binding protein, ECM adhesion |
| EST* | 2.0 | 2.20 | 2.39 | 2.43 | Hs.156044 | Unknown |
| EST* | HSC | 2.34 | 1.75 | 2.16 | Hs.272148 | Unknown, similar to PRO0478 protein |
| FOSB, GOS3, GOSB | HSC | 2.53 | 3.53 | 2.38 | Hs.75678 | Dimerizes with proteins of the JUN family |
| GATA3, HDR, MGC5445 | ND | 4.84 | 4.07 | 4.23 | Hs.169946 | GATA family member; T-cell antigen regulation |
| GBP2 | 2.2 | 4.55 | 2.31 | 2.04 | Hs.171862 | GTPase that converts GTP to GDP and GMP |
| GERP, TRIM8* | 3.0 | 3.15 | 2.06 | 2.83 | Hs.54580 | (Tumor suppressor) |
| GUCY1A3 | ND | 4.29 | 2.72 | 2.06 | Hs.75295 | Subunit of soluble guanylate cyclase |
| GUCY1B3 | HSC | 2.06 | 2.00 | 2.15 | Hs.77890 | Subunit of soluble guanylate cyclase |
| H1F0, H10, H1FV | 2.2 | 4.72 | 4.05 | 2.34 | Hs.226117 | Nucleosomes and high-order chromatin structures |
| H1F2, H1.2 | ND | 2.72 | 5.42 | 3.23 | Hs.7644 | Nucleosomes and high-order chromatin structures |
| H2A, member L | ND | 4.53 | 4.17 | 3.43 | (AL353759) | Unknown |
| H2AFA, H2A.2 | ND | 2.42 | 5.06 | 4.03 | Hs.121017 | Compaction of DNA into nucleosomes |
| H2AFO, H2A.2 | ND | 2.26 | 5.90 | 3.20 | Hs.795 | Compaction of DNA into nucleosomes |
| H2AFO, H2A.2 | ND | 2.02 | 5.66 | 2.87 | Hs.795 | Compaction of DNA into nucleosomes |
| H2B | ND | 3.72 | 3.89 | 5.06 | (AL353759) | Unknown |
| H2BFA, H2B.1A, | 2.0 | 3.35 | 2.90 | 3.54 | Hs.352109 | Compaction of DNA into nucleosomes |
| H2BFB, H2B/b | 2.0 | 2.72 | 4.56 | 3.66 | Hs.180779 | Compaction of DNA into nucleosomes |
| H2BFG, H2B/g | 2.0 | 2.18 | 4.06 | 5.35 | Hs.182137 | Compaction of DNA into nucleosome |
| H2BFL, H2B.13 | 2.0 | 2.61 | 7.21 | 3.70 | Hs.356901 | Compaction of DNA into nucleosomes |
| H2BFQ, H2B, | 2.0 | 3.01 | 6.65 | 2.81 | Hs.2178 | Compaction of DNA into nucleosomes |
| H2BFT, H2B/S, H2BFAiii | 2.0 | 2.53 | 3.14 | 3.70 | Hs.247817 | Member of the histone H2B family (unknown) |
| H3FB, H3/b, HIST1H3D | 2.0 | 2.04 | 2.26 | 5.98 | Hs.143042 | Compaction of DNA into nucleosomes |
| H3GK, H3/k, H3F1K | 2.7 | 2.01 | 3.26 | 3.35 | Hs.70937 | Compaction of DNA into nucleosomes |
| HLA-DQA1, HLA-DQ | 0.2 | 9.36 | 2.02 | 2.02 | Hs.198253 | Binds and presents peptides to CD4+ T cells |
| HLA-DQB1, HLA-DQB | 1.0 | 2.92 | 3.58 | 2.89 | Hs.73931 | Binds and presents peptides to CD4+ T cells |
| HLA-E | 4.8 | 9.17 | 3.19 | 2.05 | Hs.381008 | Nonclassical MHC I; bind β-2-microglobulin |
| HLF* | 32.0 | 60.89 | 12.90 | 10.19 | Hs.250692 | (Transcription factor) |
| HOXA3* | ND | 6.72 | 5.03 | 3.08 | Hs.248074 | Transcription factor |
| HOXB6 | ND | 11.04 | 1.81 | 5.37 | Hs.183096 | Transcription factor |
| HPIP | 3.6 | 5.89 | 2.73 | 2.35 | Hs.8068 | Inhibits the binding of PBX1-HOX to DNA |
| HSP25 | 4.0 | 3.06 | 7.21 | 4.98 | Hs.76067 | (Thermotolerance and drug resistance) |
| HSPC053* | 4.5 | 8.50 | 3.73 | 2.93 | Hs.128155 | Unknown |
| HUSI-II, SPINK2 | 7.3 | 13.29 | 3.07 | 3.22 | Hs.98243 | Protease inhibitor |
| IDI1 | HSC | 2.02 | 1.22 | 2.95 | Hs.76038 | Cholesterol metabolism |
| IEGF, PDGFD, MSTP036 | 2.8 | 2.69 | 2.21 | 2.64 | Hs.112885 | Mitogenic factor for cells of mesenchymal origin |
| INPP4B | 2.4 | 4.19 | 2.44 | 2.67 | Hs.153687 | Phosphatidylinositol signaling |
| KIAA0125* | ND | 5.50 | 2.39 | 2.23 | Hs.38365 | Unknown |
| KIAA1102 | 5.0 | 20.30 | 6.79 | 3.92 | Hs.202949 | Unknown |
| KLF2* | HSC | 3.31 | 3.18 | 5.31 | Hs.107740 | Transcription factor |
| KLF4* | 2.5 | 4.93 | 2.42 | 2.48 | Hs.356370 | Transcription factor |
| LAGY, HOP* | 2.5 | 12.95 | 3.59 | 2.83 | Hs.13775 | Unknown |
| MDS1 | ND | 3.64 | 4.86 | 2.95 | Hs.54504 | Unknown |
| MLLT3* | ND | 6.10 | 4.03 | 2.45 | Hs.404 | Unknown |
| MPLV, CD110* | HSC | 20.48 | 2.28 | 2.64 | Hs.84171 | Hematopoietic receptor superfamily member |
| NPR3 | ND | 3.97 | 2.43 | 2.07 | Hs.123655 | Involved in clearance of natriuretic peptides |
| NRIP1 | HSC | 5.55 | 5.04 | 3.75 | Hs.155017 | Modulates activity of the estrogen receptor |
| PLS3 | 3.3 | 4.18 | 2.35 | 2.43 | Hs.4114 | Actin-binding protein, hemopoietic cell lineages |
| PPM1F, FEM-2, POPX2 | 3.0 | 6.08 | 2.86 | 3.09 | Hs.278441 | Negative regulator of p21-activated kinase (PAK) |
| PRKCH | 2.5 | 8.41 | 2.72 | 2.15 | Hs.315366 | Binds phorbol esters |
| RA-GEF, KIAA0313 | 3.0 | 4.04 | 2.94 | 2.87 | Hs.154545 | Unknown |
| RBPMS, HERMES* | HSC | 52.96 | 3.92 | 5.25 | Hs.80248 | (RNA metabolism) |
| ROBO4* | 2.0 | 11.91 | 7.92 | 5.22 | Hs.111518 | Unknown, low similarity to ROBO1 |
| RPS21 | 2.0 | 3.18 | 1.48 | 2.10 | Hs.356317 | Component of the small 40S ribosomal subunit |
| SPTBN1 | 7.3 | 5.92 | 3.26 | 2.51 | Hs.107164 | Actin cross-linking proteins |
| TFPI | HSC | 4.11 | 2.44 | 2.34 | Hs.170279 | Kunitz-type protease inhibitor |
| TRAIL | 2.0 | 3.32 | 2.27 | 3.41 | Hs.83429 | Cytokine. Activation of MAPK8/JNK, caspase 8/3 |
| TLOC1 | 2.0 | 3.89 | 2.38 | 2.52 | Hs.8146 | Protein translocation apparatus of the ER |
| Unnamed* | ND | 3.43 | 1.56 | 2.47 | Hs.130694 | Unknown |
| WWP1 | 2.0 | 3.32 | 2.26 | 2.06 | Hs.355977 | Unknown |
ᵃ BM, bone marrow; CB, cord blood; PBSC, peripheral blood stem-progenitor cell; SAGE, serial analysis of gene expression; HSC, hematopoietic stem cell; ND, not determined.

ᵇ Genes marked with an asterisk (*) were confirmed by quantitative reverse transcription-PCR.

ᶜ HSC denotes that SAGE tags were only detected in the HSC population and not in the HPC population. ND indicates that unique, reliable SAGE tags were not available for this transcript.

ᵈ UniGene cluster numbers are given when available. Entries in parentheses are GenBank accession numbers for genes that have not been assigned UniGene cluster numbers.

Table 2

Overrepresented genes also found to be differentially expressed in recent microarray studiesᵃ

| Gene nameᵇ | Human UniGene ID | Mouse UniGene ID | Average fold changeᶜ | Santos mouseᵈ | Ivanova humanᵉ | Ivanova mouseᶠ | BM SAGEᵍ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HLF* | Hs.250692 | mm.45146 | 27.99 | no | YES | YES | YES |
| HERMES* | Hs.80248 | mm.12436 | 20.71 | YES | YES | ND | YES |
| CD110* | Hs.84171 | mm.4864 | 8.47 | YES | no | YES | YES |
| ROBO4* | Hs.111518 | mm.27782 | 8.35 | no | ND | YES | YES |
| HOXB6 | Hs.183096 | mm.215 | 6.07 | no | YES | no | ND |
| GATA3 | Hs.169946 | mm.606 | 4.38 | YES | YES | YES | ND |
| SOCS-2* | Hs.351744 | mm.4132 | 4.35 | YES | ND | no | YES |
| SPTBN1 | Hs.107164 | mm.3601 | 3.89 | YES | no | no | YES |
| MDS1 | Hs.54504 | mm.56965 | 3.82 | no | YES | YES | ND |
| KLF4* | Hs.356370 | mm.4325 | 3.28 | no | ND | YES | YES |
| TRAIL | Hs.83429 | mm.1062 | 3.00 | no | YES | ND | YES |
| GBP2 | Hs.171862 | mm.24038 | 2.97 | YES | ND | ND | YES |
| DKFZP434J214* | Hs.12813 | mm.21712 | 2.79 | YES | ND | ND | YES |
| CEBPB | Hs.99029 | mm.4863 | 2.23 | YES | YES | no | YES |
ᵃ “YES” signifies that the gene was overrepresented, “no” that the gene was not overrepresented, and “ND” that the gene expression was not determined.

ᵇ Genes marked with an asterisk (*) have been confirmed by real-time PCR (Fig. S2).

ᶜ The average of the fold changes from BM, cord blood, and peripheral blood stem-progenitor cells.

ᵈ Mouse side population cells from Table S3 of Ramalho-Santos et al. (11).

ᵉ Human fetal liver hematopoietic stem cell from Database S3 of Ivanova et al. (12).

ᶠ Mouse BM and fetal liver hematopoietic stem cell from Table S2 and Database S2 of Ivanova et al. (12).

ᵍ BM, bone marrow; SAGE, serial analysis of gene expression.
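
As a worked check of the “Average fold change” column of Table 2, the value for each gene is the arithmetic mean of its BM, CB, and PBSC fold changes in Table 1. The short Python sketch below recomputes four of the entries; the fold changes are transcribed from Table 1, while the variable names are hypothetical and rounding to two decimal places is an assumption made to match the table.

```python
from statistics import mean

# (BM, CB, PBSC) microarray fold changes transcribed from Table 1.
FOLD_CHANGES = {
    "HLF": (60.89, 12.90, 10.19),
    "HERMES": (52.96, 3.92, 5.25),
    "CD110": (20.48, 2.28, 2.64),
    "GATA3": (4.84, 4.07, 4.23),
}

# Arithmetic mean across the three tissues, rounded to two decimals.
average_fold = {
    gene: round(mean(values), 2) for gene, values in FOLD_CHANGES.items()
}

for gene, value in average_fold.items():
    print(f"{gene}: {value:.2f}")
```

For example, HLF gives (60.89 + 12.90 + 10.19)/3 = 27.99, which matches the tabulated value.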

1
Civin CI, Trischmann T, Kadan NS, et al Highly purified CD34-positive cells reconstitute hematopoiesis.
J Clin Oncol
,
14
:
2224
-33,  
1996
.
2
Larochelle A, Vormoor J, Hanenberg H, et al Identification of primitive human hematopoietic cells capable of repopulating NOD/SCID mouse bone marrow: implications for gene therapy.
Nat Med
,
2
:
1329
-37,  
1996
.
3
Krause DS, Fackler MJ, Civin CI, May WS. CD34: structure, biology, and clinical utility.
Blood
,
87
:
1
-13,  
1996
.
4
Civin CI, Strauss LC, Brovall C, Fackler MJ, Schwartz JF, Shaper JH. Antigenic analysis of hematopoiesis. III. A hematopoietic progenitor cell surface antigen defined by a monoclonal antibody raised against KG-1a cells.
J Immunol
,
133
:
157
-65,  
1984
.
5
Bhatia M, Bonnet D, Murdoch B, Gan OI, Dick JE. A newly discovered class of human hematopoietic cells with SCID-repopulating activity.
Nat Med
,
4
:
1038
-45,  
1998
.
6
Phillips RL, Ernst RE, Brunk B, et al The genetic program of hematopoietic stem cells.
Science (Wash. DC)
,
288
:
1635
-40,  
2000
.
7
Terskikh AV, Easterday MC, Li L, et al From hematopoiesis to neuropoiesis: evidence of overlapping genetic programs.
Proc Natl Acad Sci USA
,
98
:
7934
-9,  
2001
.
8
Park IK, He Y, Lin F, et al Differential gene expression profiling of adult murine hematopoietic stem cells.
Blood
,
99
:
488
-98,  
2002
.
9
Akashi K, He X, Chen J, et al Transcriptional accessibility for genes of multiple tissues and hematopoietic lineages is hierarchically controlled during early hematopoiesis.
Blood
,
101
:
383
-9,  
2003
.
10
Steidl U, Kronenwett R, Rohr UP, et al Gene expression profiling identifies significant differences between the molecular phenotypes of bone marrow-derived and circulating human CD34+ hematopoietic stem cells.
Blood
,
99
:
2037
-44,  
2002
.
11
Ramalho-Santos M, Yoon S, Matsuzaki Y, Mulligan RC, Melton DA. “Stemness”: transcriptional profiling of embryonic and adult stem cells.
Science (Wash. DC)
,
298
:
597
-600,  
2002
.
12
Ivanova NB, Dimos JT, Schaniel C, Hackney JA, Moore KA, Lemischka IR. A stem cell molecular signature.
Science (Wash. DC)
,
298
:
601
-4,  
2002
.
13
Fortunel NO, Otu HH, Ng HH, et al. Comment on “ ’Stemness’: transcriptional profiling of embryonic and adult stem cells” and “a stem cell molecular signature.” Science (Wash. DC) 2003;302:393; author reply 393.
14
Evsikov AV, Solter D. Comment on “ ’Stemness’: transcriptional profiling of embryonic and adult stem cells” and “a stem cell molecular signature”. Science (Wash. DC) 2003;302:393; author reply 393.
15
Vogel G. Stem cells. ’Stemness’ genes still elusive.
Science (Wash. DC)
,
302
:
371
2003
.
16
Civin CI, Almeida-Porada G, Lee MJ, Olweus J, Terstappen LW, Zanjani ED. Sustained, retransplantable, multilineage engraftment of highly purified adult human bone marrow stem cells in vivo.
Blood
,
88
:
4102
-9,  
1996
.
17
Gao Z, Fackler MJ, Leung W, et al Human CD34+ cell preparations contain over 100-fold greater NOD/SCID mouse engrafting capacity than do CD34- cell preparations.
Exp Hematol
,
29
:
910
-21,  
2001
.
18
Leung W, Ramirez M, Civin CI. Quantity and quality of engrafting cells in cord blood and autologous mobilized peripheral blood.
Biol Blood Marrow Transplant
,
5
:
69
-76,  
1999
.
19
Velculescu VE, Vogelstein B, Kinzler KW. Analysing uncharted transcriptomes with SAGE.
Trends Genet
,
16
:
423
-5,  
2000
.
20
Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression.
Science (Wash. DC)
,
270
:
484
-7,  
1995
.
21
Velculescu VE, Madden SL, Zhang L, et al Analysis of human transcriptomes.
Nat Genet
,
23
:
387
-8,  
1999
.
22
Kowalski J, Powell J. Nonparametric inference for stochastic linear hypotheses: application to high-dimensional data. Bioinformatics. In press 2004.
23
Kowalski J, Drake C, Schwartz RH, Powell J. Nonparametric, hypothesis-based analysis of microarrays for comparison of several phenotypes.
Bioinformatics
,
20
:
364
-73,  
2004
.
24
Kanehisa M, Goto S, Kawashima S, Nakaya A. The KEGG databases at GenomeNet.
Nucleic Acids Res
,
30
:
42
-6,  
2002
.
25
Datson NA, van der Perk-de Jong J, van den Berg MP, de Kloet ER, Vreugdenhil E. MicroSAGE: a modified procedure for serial analysis of gene expression in limited amounts of tissue.
Nucleic Acids Res
,
27
:
1300
-7,  
1999
.
26
Man MZ, Wang X, Wang Y. POWER SAGE: comparing statistical tests for SAGE experiments.
Bioinformatics
,
16
:
953
-9,  
2000
.
27
Becquet C, Blachon S, Jeudy B, Boulicaut JF, Gandrillon O. Strong-association-rule mining for large-scale gene-expression data analysis: a case study on human SAGE data. Genome Biol [serial on the Internet]. 1979 [cited 2003];3:[about 16 p.]. Available from: http://genomebiology.com/2002/3/12/research/0067.
28
Ruijter JM, Van Kampen AH, Baas F. Statistical evaluation of SAGE libraries: consequences for experimental design.
Physiol Genomics
,
11
:
37
-44,  
2002
.
29
van Ruissen F, Jansen BJ, de Jongh GJ, van Vlijmen-Willems IM, Schalkwijk J. Differential gene expression in premalignant human epidermis revealed by cluster analysis of serial analysis of gene expression (SAGE) libraries.
FASEB J
,
16
:
246
-8,  
2002
.
30
Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method.
Methods
,
25
:
402
-8,  
2001
.
31
Zhou G, Chen J, Lee S, Clark T, Rowley JD, Wang SM. The pattern of gene expression in human CD34(+) stem/progenitor cells.
Proc Natl Acad Sci USA
,
98
:
13966
-71,  
2001
.
32
Wang W, Wang X, Ward AC, Touw IP, Friedman AD. C/EBPalpha and G-CSF receptor signals cooperate to induce the myeloperoxidase and neutrophil elastase genes. Leukemia (Baltimore), 15: 779-86, 2001.
33
Friedman AD. Regulation of immature myeloid cell differentiation by PEBP2/CBF, Myb, C/EBP and Ets family members. Curr Top Microbiol Immunol, 211: 149-57, 1996.
34
Friedman AD, Britos-Bray M, Suzow J. The murine myeloperoxidase gene contains a bipartite distal enhancer, including a novel region regulated by PEBP2/CBF. Leuk Res, 20: 809-15, 1996.
35
Wani MA, Wert SE, Lingrel JB. Lung Kruppel-like factor, a zinc finger transcription factor, is essential for normal lung development. J Biol Chem, 274: 21180-5, 1999.
36
Rosati M, Valentin A, Patenaude DJ, Pavlakis GN. CCAAT-enhancer-binding protein beta (C/EBP beta) activates CCR5 promoter: increased C/EBP beta and CCR5 in T lymphocytes from HIV-1-infected individuals. J Immunol, 167: 1654-62, 2001.
37
Zhu S, Yoon K, Sterneck E, Johnson PF, Smart RC. CCAAT/enhancer binding protein-beta is a mediator of keratinocyte survival and skin tumorigenesis involving oncogenic Ras signaling. Proc Natl Acad Sci USA, 99: 207-12, 2002.
38
Cambot M, Aresta S, Kahn-Perles B, de Gunzburg J, Romeo PH. Human immune associated nucleotide 1: a member of a new guanosine triphosphatase family expressed in resting T and B cells. Blood, 99: 3293-301, 2002.
39
Balavoine G, de Rosa R, Adoutte A. Hox clusters and bilaterian phylogeny. Mol Phylogenet Evol, 24: 366-73, 2002.
40
Prince V. The Hox Paradox: More complex(es) than imagined. Dev Biol, 249: 1-15, 2002.
41
Chisaka O, Capecchi MR. Regionally restricted developmental defects resulting from targeted disruption of the mouse homeobox gene hox-1.5. Nature (Lond.), 350: 473-9, 1991.
42
Watari N, Kameda Y, Takeichi M, Chisaka O. Hoxa3 regulates integration of glossopharyngeal nerve precursor cells. Dev Biol, 240: 15-31, 2001.
43
Manley NR, Capecchi MR. Hox group 3 paralogs regulate the development and migration of the thymus, thyroid, and parathyroid glands. Dev Biol, 195: 1-15, 1998.
44
Su DM, Manley NR. Hoxa3 and pax1 transcription factors regulate the ability of fetal thymic epithelial cells to promote thymocyte development. J Immunol, 164: 5753-60, 2000.
45
Shen WF, Detmer K, Mathews CH, et al. Modulation of homeobox gene expression alters the phenotype of human hematopoietic cell lines. EMBO J, 11: 983-9, 1992.
46
Magli MC, Largman C, Lawrence HJ. Effects of HOX homeobox genes in blood cell differentiation. J Cell Physiol, 173: 168-77, 1997.
47
Sauvageau G, Lansdorp PM, Eaves CJ, et al. Differential expression of homeobox genes in functionally distinct CD34+ subpopulations of human bone marrow cells. Proc Natl Acad Sci USA, 91: 12223-7, 1994.
48
Giampaolo A, Felli N, Diverio D, et al. Expression pattern of HOXB6 homeobox gene in myelomonocytic differentiation and acute myeloid leukemia. Leukemia (Baltimore), 16: 1293-301, 2002.
49
Kappen C. Disruption of the homeobox gene Hoxb-6 in mice results in increased numbers of early erythrocyte progenitors. Am J Hematol, 65: 111-8, 2000.
50
Rhodes DR, Barrette TR, Rubin MA, Ghosh D, Chinnaiyan AM. Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Res, 62: 4427-33, 2002.
51
Khan J, Bittner ML, Chen Y, Meltzer PS, Trent JM. DNA microarray technology: the anticipated impact on the study of human disease. Biochim Biophys Acta, 1423: M17-28, 1999.
52
Lapidot T, Sirard C, Vormoor J, et al. A cell initiating human acute myeloid leukaemia after transplantation into SCID mice. Nature (Lond.), 367: 645-8, 1994.
53
Lapidot T, Grunberger T, Vormoor J, et al. Identification of human juvenile chronic myelogenous leukemia stem cells capable of initiating the disease in primary and secondary SCID mice. Blood, 88: 2655-64, 1996.
54
Reya T, Morrison SJ, Clarke MF, Weissman IL. Stem cells, cancer, and cancer stem cells. Nature (Lond.), 414: 105-11, 2001.
55
Hemmati HD, Nakano I, Lazareff JA, et al. Cancerous stem cells can arise from pediatric brain tumors. Proc Natl Acad Sci USA, 100: 15178-83, 2003.
56
Al-Hajj M, Wicha MS, Benito-Hernandez A, Morrison SJ, Clarke MF. Prospective identification of tumorigenic breast cancer cells. Proc Natl Acad Sci USA, 100: 3983-8, 2003.
57
Bonnet D. Normal and leukemic CD34-negative human hematopoietic stem cells. Rev Clin Exp Hematol, 5: 42-61, 2001.