Abstract
Chromosome 13q14 deletions constitute the most common structural aberration in B-cell chronic lymphocytic leukemia (B-CLL). We constructed a high-resolution physical map covering the critical deleted region in B-CLL at 13q14 and flanking sequences. The order and position of both genomic markers and known genes were determined precisely. Three novel genes, CLLD6, CLLD7, and CLLD8, were isolated and characterized. The predicted protein sequence of CLLD6 revealed no homology with known proteins. However, both CLLD7 and CLLD8 predicted proteins contain known functional domains. CLLD7 has both an RCC1 and a BTB domain, and could thus be involved in cell cycle regulation by chromatin remodeling. CLLD8 contains a methyl-CpG binding, a preSET and a SET domain, suggesting that CLLD8 might be associated with methylation-mediated transcriptional repression. Mutation analysis of hematopoietic tumor cell lines and B-CLL tumor samples revealed no point mutations within the coding region of these three novel genes. The functional domains present within CLLD7 and CLLD8 suggest that the proteins may be involved in critical cellular processes such as cell cycle and transcriptional control and could therefore be directly or indirectly involved in leukemogenesis.
Introduction
B-CLL is the most common leukemia in the United States and Western Europe, accounting for ∼30% of all cases of leukemia in adults. In >95% of patients, the disorder represents the accumulation of slowly proliferating, long-lived B lymphocytes derived from a single clone. Genetic abnormalities are found in 50% of cases of B-CLL by cytogenetic analysis (1). The most common genetic abnormalities are deletions or translocations of chromosome 13 and trisomy 12. Trisomy 12 is found in about one-third of all studied B-CLL patients with clonal chromosomal abnormalities (2). Although the pathogenetic significance of trisomy 12 is unknown, trisomy 12 has been correlated with atypical lymphocyte morphology, higher proliferative activity, and poor prognosis (3). Chromosome 13 is the chromosome most often involved in structural aberrations in B-CLL. Two-thirds of the structural aberrations are deletions, whereas one-third are translocations with associated deletions at chromosome 13q14 (1). The retinoblastoma tumor suppressor gene (RB1) is located at 13q14, and investigators have speculated that it represents an obvious candidate gene for involvement in B-CLL. Allelic loss analyses revealed that monoallelic loss of RB1 was a consistent finding in B-CLL with del 13q14 (4, 5) and may be found in cases with normal karyotype using Southern blot (4) or in situ hybridization analysis (6). However, biallelic deletions of the RB1 region are rarely observed in B-CLL. Furthermore, RB1 mRNA and RB1 protein are usually expressed in B-CLL, and no point mutations in RB1 have been found (5, 7). These studies suggest that at least one functional RB1 allele is retained in B-CLL. Thus, functional loss of RB1 protein does not seem to be relevant to the pathogenesis of B-CLL. To identify a candidate gene involved in B-CLL, further studies were aimed at more precisely defining the critical region.
These studies have shown that the genetic marker D13S25, which lies 1.6 cM distal to RB1, is hemi- or homozygously deleted more frequently than RB1 (7, 8, 9, 10). Subsequent studies using genomic markers have defined the minimal deleted region around D13S25. Stilgenbauer et al. (11) reported a critical deleted region located between RB1 and D13S25. Liu et al. (12) found that the most commonly deleted region is around D13S319, which is located between RB1 and D13S25, whereas we have defined the smallest deleted region located between 206XF12 and D13S25 (13). Recently, several groups have constructed high-resolution physical maps of DNA fragments cloned in cosmids, P1 artificial chromosomes, and BACs between RB1 and D13S25 (14, 15, 16, 17, 18, 19, 20). These groups have also identified the transcribed sequences within the critical deleted region. Liu et al. (16) have narrowed the minimally deleted region to <10 kb adjacent to D13S319. They isolated two novel candidate genes named LEU1 and LEU2. However, no mutations have been detected in these genes (16, 21). Another candidate gene termed LEU5, located adjacent to LEU1 and LEU2, was isolated (17). LEU5 contains a zinc-finger domain of the RING type and shares homology to some known genes involved in tumorigenesis and embryogenesis. Another gene in the region, Karyopherin α3 (KPNA3), is located near D13S273 (20). This gene encodes a protein highly homologous to certain nuclear transport proteins, such as Xenopus importin, yeast SRP1, and human RCH1 (22). However, to our knowledge, no mutations were found in either gene in B-CLL. Our data (13) have suggested the presence of more than one region involved in loss of heterozygosity in B-CLL. Corcoran et al. (18) have described similar results.
In the present study, we describe a high-resolution physical map extending 300 kb centromeric to D13S273 and spanning the most centromeric of the suggested regions of loss in B-CLL as well as a region of loss recently reported in B-cell non-Hodgkin’s lymphoma (23). The map covers ∼770 kb of genomic sequence and was assembled from both public database sequences and sequence obtained in our laboratory. On the basis of homology searches between our genomic sequence and public EST databases, we isolated and characterized three novel candidate genes, CLLD6, CLLD7, and CLLD8 (CLL deletion genes 6, 7, and 8), from this extended region. Two of them, CLLD7 and CLLD8, show homology to known proteins and are likely to play critical roles in cell cycle control and regulation of gene expression.
Materials and Methods
Cell Lines.
Human hematopoietic tumor cell lines AG876, AS283, BL2, BL30, BL41, CA46, DA978, Daudi, EB-B, ED36, Jiyoye, Lauckes, Namalwa, P3HR1, Ramos, RS11864, SKDHL and WMN (Burkitt’s lymphoma), HuNS1, MC/CAR, NCI-H929, RPMI8226 and U266B1 (multiple myeloma), DB and SR (large cell lymphoma), JM1 (immunoblastic B cell lymphoma), HT (diffuse mix lymphoma), RPMI6666 and Hs445 (Hodgkin’s disease), and RL (non-Hodgkin’s disease) were obtained from the American Type Culture Collection (Manassas, VA) and maintained according to American Type Culture Collection instructions.
Patient Samples.
Tumor samples derived from peripheral blood from patients with a diagnosis of B-CLL were obtained after the patients’ informed consent. Peripheral blood mononuclear cells were separated by density-gradient centrifugation on Ficoll-Hypaque PLUS (Amersham Pharmacia Biotech, Piscataway, NJ). Peripheral blood mononuclear cells consisted of >90% leukemic B lymphocytes. DNA was then extracted by conventional phenol-chloroform extraction.
BAC Clones and Sequencing.
BAC clones CITB-369L16 and CITB-317G11 were obtained from Research Genetics (Huntsville, AL). BAC DNA was isolated using the Qiagen Maxi prep kit (Qiagen, Valencia, CA) according to the manufacturer’s protocol. BAC DNA libraries for shotgun sequencing were constructed and sequenced as described previously (24). A complete sequence, RPCI11–432M24 (GenBank accession number AL135901), and sequences in progress, RPCI11–185C18 (AL139321), RPCI11–195L15 (AL136301), and RPCI11–236M15 (AL136123), were obtained from The Human Genome Project public consortium.
Human EST Clones and Sequencing.
The cDNA clones corresponding to the novel EST clusters were purchased from Research Genetics and sequenced (ESTs on 236M15: GenBank accession numbers AI364677 and T80031; ESTs on telomeric 185C18: AA311792 and AA186609; and ESTs on centromeric 185C18: AA188107 and AI374660). Sequencing reactions and analyses were performed by using the Applied Biosystems Prism BigDye terminator reaction chemistry on a Perkin-Elmer Gene Amp PCR system 9600 and the Applied Biosystems Prism 377 DNA sequencing system.
cDNA Library Screening.
To isolate full-length cDNA, the large-insert cDNA library from human fetal brain was purchased from Clontech and screened by using the supplier’s protocol with corresponding sequenced ESTs.
RACE.
The 3′ ends of the cloned mRNAs were obtained by RACE from fetal brain. Normal human fetal brain poly(A)+ RNA was purchased from Clontech (Palo Alto, CA). First-strand cDNA was synthesized from 1 μg of human fetal brain poly(A)+ RNA by using the SMART RACE cDNA amplification kit (Clontech). RACE reactions were performed according to the SMART RACE protocol with Advantage 2 polymerase mix (Clontech); 3′ RACE products were generated by the touchdown PCR method using gene-specific primers and the universal primer mix (Clontech). To increase the specificity of the procedure, the secondary PCR was carried out by using nested gene-specific primers and nested universal primers (Clontech). The primary PCR conditions were 5 cycles of 94°C for 10 s, 72°C for 4 min; 5 cycles of 94°C for 10 s, 70°C for 30 s, 72°C for 4 min; and 32 cycles of 94°C for 10 s, 65°C for 30 s, 72°C for 4 min. A 1:50 dilution of the primary PCR product was used as template for the secondary PCR. Secondary PCR conditions were 25 cycles of 94°C for 10 s, 65°C for 30 s, and 72°C for 4 min. 3′-RACE primers were: CLLD6, GAATCTGGGGTATTGGTGTTGCAACTC (primary) and GTGAGTTTTATCATACGCCTCCACCTGGG (secondary); CLLD7, CATGGTGGATGTCGGAAAGTGGCCATC (primary) and GCCAGTGAAGCACTGTACGTTACTGAC (secondary); and CLLD8, CTTGTGGAAGGAGTCTACGAAACGTGG (primary) and CTGTGACTGCTCTGAGGGCTGCATAGAC (secondary).
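The touchdown program described above can be written down as data, which makes the stepwise drop in annealing temperature and the total cycle count easy to check. The following is a minimal sketch; the class and function names are illustrative, and the computed time counts only the programmed hold times, since ramp times between temperatures are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple   # ordered Steps; the first is always denaturation


# Primary and secondary 3' RACE programs exactly as stated in the text.
PRIMARY = (
    Stage(5, (Step(94, 10), Step(72, 240))),
    Stage(5, (Step(94, 10), Step(70, 30), Step(72, 240))),
    Stage(32, (Step(94, 10), Step(65, 30), Step(72, 240))),
)
SECONDARY = (Stage(25, (Step(94, 10), Step(65, 30), Step(72, 240))),)


def total_cycles(program):
    return sum(stage.cycles for stage in program)


def hold_seconds(program):
    # Sum of programmed hold times only (ramping is ignored).
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


def annealing_temps(program):
    # The step after denaturation sets the annealing stringency of a stage;
    # in the two-step first stage, annealing and extension share 72 C.
    return [stage.steps[1].temp_c for stage in program]


if __name__ == "__main__":
    print("primary cycles:", total_cycles(PRIMARY))
    print("primary annealing temps:", annealing_temps(PRIMARY))
    print("primary hold time (min):", hold_seconds(PRIMARY) / 60)
    print("secondary hold time (min):", hold_seconds(SECONDARY) / 60)
```

The annealing temperatures fall from 72°C to 70°C to 65°C across the three primary stages, which is the defining feature of a touchdown protocol: early cycles favor the most specific primer-template pairs before the stringency is relaxed.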
Northern Blot Analysis.
Human multiple tissue Northern blots, as shown in Fig. 2 (Lanes 1–27), were purchased from Clontech. For Northern blots shown in Fig. 2 (Lanes 28–34), total RNA was extracted from human hematopoietic tumor cell lines by the Qiagen RNeasy Mini kit (Qiagen), according to the manufacturer’s protocol. Thirty μg of total RNA were separated by electrophoresis in 1.0% denaturing agarose gels and transferred to Hybond N+ positively charged nylon membranes (Amersham Pharmacia Biotech) by using standard procedures. The membranes were hybridized with the corresponding cDNA probe labeled with [α-32P]dCTP by random priming. Prehybridization and hybridization were carried out in 50% formamide, 5× SSPE, 10× Denhardt’s solution, 2% SDS, and 0.2 mg/ml heat-denatured salmon sperm DNA for 20–24 h at 42°C. Hybridized membranes were washed in 2× SSC, 0.05% SDS for 25 min at room temperature and 0.1× SSC, 0.1% SDS for 30 min at 50°C. Northern blots were stripped in 0.5% SDS for 15 min at 100°C and reused with other probes.
Reverse Transcription-PCR Analysis.
All cDNA sequences were confirmed by reverse transcription-PCR. cDNA was synthesized from 3 μg of total RNA or 1 μg of poly(A)+ RNA using the Advantage RT-for-PCR kit (Clontech). Five μl of cDNA were used for each PCR with 10 pmol of each gene-specific primer for 35 cycles of 94°C for 15 s, 60–65°C for 30 s, and 72°C for 1–2 min. The cDNAs spanning the coding region of each gene were synthesized using CLLD6F (GATGGCCACCTCGGTGTTGTGCTGCCTG) and CLLD6R (CATTGGTTCCTTAGACGGCATCAACAAG) for the CLLD6 coding region, CLLD7F (GCACTTCTGCGCCCATTGGAGCTTCGG) and CLLD7R (CTTTTACCTGCAGAATCACATCACCCG) for the CLLD7 coding region, and CLLD8F (CACAGTTGGATTCCAGTGATATTCTGC) and CLLD8R (GATGGACCTAGACCTTCTTTTGCATGGC) for the CLLD8 coding region. The PCR products were separated on 1.0–2.0% agarose gels and either gel purified using the QIAquick gel extraction kit (Qiagen) according to the manufacturer’s instructions and sequenced, or cloned in TA vector using TOPO TA Cloning (Invitrogen, Carlsbad, CA) and sequenced.
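For readers who want to inspect the coding-region primers listed above, a short sketch follows that reports the length and GC fraction of each primer and the reverse complement of each reverse primer (the sequence as it would read on the coding strand). The helper names are illustrative and are not part of the original methods; no melting temperatures are computed because the text does not state the formula the authors used.

```python
# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "CLLD6F": "GATGGCCACCTCGGTGTTGTGCTGCCTG",
    "CLLD6R": "CATTGGTTCCTTAGACGGCATCAACAAG",
    "CLLD7F": "GCACTTCTGCGCCCATTGGAGCTTCGG",
    "CLLD7R": "CTTTTACCTGCAGAATCACATCACCCG",
    "CLLD8F": "CACAGTTGGATTCCAGTGATATTCTGC",
    "CLLD8R": "GATGGACCTAGACCTTCTTTTGCATGGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        line = f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}"
        if name.endswith("R"):
            # Reverse primers anneal to the coding strand, so their reverse
            # complement is what appears in the cDNA sequence itself.
            line += f", coding-strand site = {reverse_complement(seq)}"
        print(line)
```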
Database Searches.
Database searches were performed using the BLAST network service on the National Center for Biotechnology Information. The advanced BLAST search program was used to detect homologies between BAC sequences and the nucleotide databases, between BAC sequences and the dbEST databases, and between predicted protein sequences and the protein databases. The CD-search program was used to detect similarities between predicted protein sequences and the conserved domain databases. Alignment of amino acid sequences was performed by the GeneDoc, version 2.6001, multiple sequence alignment editor program.
Genomic PCR.
DNA extraction from hematopoietic cell lines was performed using the Qiagen DNeasy tissue kit (Qiagen) according to the manufacturer’s protocol. Primer sets used in mutation analysis were designed from intron sequences flanking each exon. PCRs were carried out for 35 cycles of 94°C for 15 s, 50–55°C for 30 s, and 72°C for 1 min. PCR products were purified with the QIAquick PCR purification kit (Qiagen) according to the manufacturer’s instructions, or gel-purified, and then sequenced directly.
Results
Physical Map of the B-CLL Critical Region.
To complete the characterization of the B-CLL critical deleted region, a BAC contig was constructed. We completely sequenced BAC clones 369L16 and 317G11 by shotgun sequencing. The position of the BACs is illustrated in Fig. 1A. The BAC contig spans ∼770 kb, extending over 300 kb centromeric to D13S273. BAC sequences were used to query the nucleotide database using BLAST. Five known genes were found and placed exactly on the map: LEU1, LEU2 (16), LEU5 (17), Karyopherin α3 (22), and NY-REN-34 antigen (25). The position of these genes in the contig is shown in Fig. 1A. To identify novel genes, we used the BAC-derived genomic sequence to search the human dbEST databases. Three independent clusters of ESTs that did not correspond to known genes were identified and characterized further.
Identification and Expression Analysis of the Novel Genes.
The most telomeric EST cluster was named CLLD6 (Fig. 1A). A cDNA probe corresponding to this cluster recognized three transcripts on Northern blots of 1.1, 1.5, and 3.1 kb (Fig. 2). cDNA library screening and RACE experiments allowed us to isolate cDNA clones corresponding to all three transcripts. Sequence analysis revealed that the different transcripts result from the use of different polyadenylation sites. Comparison of the genomic sequence to the cDNA sequence revealed that CLLD6 spans 24 kb and is composed of five exons (Fig. 1B). The putative open reading frame extends from exon 1 to exon 5 and encodes a 196-amino acid protein with a predicted molecular mass of 21.7 kDa. BLAST analysis of the protein databases revealed no significant homologies with known proteins. Northern blot analysis of normal human tissues and human cancer cell lines revealed that CLLD6 is widely expressed with the highest levels in heart, skeletal muscle, and testis and the lowest levels in thymus, peripheral blood leukocytes, lymph node, and bone marrow (Fig. 2).
Another EST cluster located ∼150 kb centromeric to D13S273 was named CLLD7 (Fig. 1A). The corresponding probe recognizes a 4-kb mRNA, which is ubiquitously expressed (Fig. 2). The gene was found to span 54 kb of genomic DNA and consists of 13 exons (Fig. 1B). Sequence analysis of the mRNA revealed a putative coding region of 1596 bp in length encoding a protein of 531 amino acids with a predicted molecular mass of 58.2 kDa. BLAST analysis of the protein databases revealed the highest homology to human RLG, also called CHC1L (for chromosome condensation 1-like; Ref. 26; Fig. 3A). BLAST and CD searches revealed that CLLD7 contains two conserved domains: an NH2-terminal RCC1 domain and a COOH-terminal BTB domain (Fig. 3). Fig. 4A shows the alignment of RCC1 domains from CLLD7, RLG, Ceb1 (27), and RCC1 (28). The RCC1 domain within CLLD7 is composed of six intradomain repeats of about 42–53 residues. These repeats consist of highly conserved or invariant residues at approximately the same positions in each repeat. Fig. 4B shows the alignment of the BTB domains of CLLD7, RLG, BCL6 (29), and PLZF (30). In most cases, the BTB domain is found near the NH2 terminus of a subset of zinc finger proteins such as BCL6. However, the BTB domain of CLLD7 is localized at the COOH terminus of the protein (Fig. 3), and no zinc finger domain was found in CLLD7.
The most centromeric EST cluster was designated CLLD8 (Fig. 1A). The corresponding gene spans 49 kb and consists of 15 exons (Fig. 1B). The CLLD8 mRNA is ∼3.3 kb, and the putative coding region is 2160 bp in length and encodes a protein of 719 amino acids with a predicted molecular mass of 81.9 kDa. Northern blot analysis of normal human tissues and human cancer cell lines revealed that CLLD8 is widely expressed as a 3.3-kb transcript with the highest levels in heart, testis, ovary, HL-60, HT, and DB (Fig. 2). BLAST analysis of the protein databases revealed significant homology with human SETDB1 (45% identity and 61% similarity; Ref. 31). BLAST and CD searches revealed that CLLD8 contains three conserved domains: a methyl-CpG binding domain, a preSET domain, and a SET domain. The positions of these domains are illustrated in Fig. 5. Fig. 6A shows the alignment of the methyl-CpG binding domain of CLLD8 with that of SETDB1, MBD3 (32), and MeCP2 (33). Fig. 6B shows the alignment of the preSET domains of CLLD8, SETDB1, and SUV39H1 (34). The CLLD8 preSET domain is immediately NH2-terminal to the SET domain and contains the conserved cysteine residues. Fig. 6C shows the alignment of the SET domains of CLLD8, SETDB1, SUV39H1, ALL-1 (35), and Pisum sativum Rubisco ls-MT (36). CLLD8 contains a large insertion of 218 amino acids in the middle of its SET domain. SET domains with insertions at similar positions, which are named bifurcated SET domains (31), are found in SETDB1 and Rubisco ls-MT. These insertions vary in size and show no homology with one another. SETDB1, which has a methyl-CpG binding domain, a preSET domain, and a SET domain in the same position as those in CLLD8, is the human protein most closely related to CLLD8.
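The coding-region lengths and protein sizes reported for CLLD7 and CLLD8 can be cross-checked with simple arithmetic. The sketch below assumes that each stated coding-region length includes the stop codon (an assumption; the text does not say so explicitly), so the residue count is the number of codons minus one. It also reports the mean mass per residue implied by the predicted molecular masses. Function and dictionary names are illustrative.

```python
# Values taken from the text; "cds_bp" is assumed to include the stop codon.
GENES = {
    "CLLD7": {"cds_bp": 1596, "residues": 531, "mass_kda": 58.2},
    "CLLD8": {"cds_bp": 2160, "residues": 719, "mass_kda": 81.9},
}


def residues_from_cds(cds_bp):
    """Number of amino acids encoded by a coding region that ends in a stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return cds_bp // 3 - 1  # the final codon is the stop codon


def mean_residue_mass_da(mass_kda, residues):
    """Average mass per residue in daltons implied by a predicted protein mass."""
    return mass_kda * 1000 / residues


if __name__ == "__main__":
    for name, g in GENES.items():
        expected = residues_from_cds(g["cds_bp"])
        mean_mass = mean_residue_mass_da(g["mass_kda"], g["residues"])
        status = "consistent" if expected == g["residues"] else "inconsistent"
        print(f"{name}: {g['cds_bp']} bp -> {expected} aa ({status}), "
              f"mean residue mass = {mean_mass:.1f} Da")
```

Under that assumption, both reported protein lengths agree with the coding-region lengths, and the implied mean residue masses fall near the commonly used average of roughly 110 Da per amino acid.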
Mutation Analysis of CLLD6, CLLD7, and CLLD8 Genes in Human Hematopoietic Tumor Cell Lines and B-CLL Samples.
Mutation analysis of the CLLD6, CLLD7, and CLLD8 coding regions was carried out by direct sequencing of genomic PCR products. A total of 64 cancers, including 30 human hematopoietic tumor cell lines and 34 B-CLL tumor samples, were analyzed. Results of the analyses are summarized in Table 1. No point mutations were detected in any of the tumor samples we analyzed. Several single-nucleotide polymorphisms were found in CLLD7 and CLLD8. Two nonsynonymous polymorphisms were found in CLLD7; a Thr 500→Ile (T→C) was observed in heterozygous form only in the HuNS1 cell line. This amino acid substitution involved the substitution of a polar with a non-polar amino acid. Three nonsynonymous polymorphisms were found in CLLD8: (a) a Gly 117→Glu (G→A) substitution represented a change from a non-polar amino acid to a charged polar amino acid; (b) a Lys 408→Ile (A→T) was observed in the CA46 cell line and one B-CLL sample and affected one allele. This amino acid substitution also involved the substitution of a polar with a non-polar amino acid; and (c) Met 473→Val (A→G) substitutes a sulfur-containing amino acid with an aliphatic amino acid. In addition to nonsynonymous polymorphisms, eight synonymous polymorphisms of CLLD7 and one synonymous polymorphism of CLLD8 were also found.
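Each nonsynonymous polymorphism above pairs an amino acid change with a single-base change. As a hedged illustration, the sketch below uses the standard genetic code to list which single-base substitutions on the coding strand can produce each reported amino acid change. The function name and table layout are illustrative, and the actual codons at these positions are not given in the text, so the code enumerates all possibilities rather than asserting which codon is present.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def single_base_changes(ref_aa, alt_aa):
    """Set of (ref_base, alt_base) pairs that can turn ref_aa into alt_aa
    by changing exactly one base of a codon (coding strand)."""
    changes = set()
    for ref_codon, aa1 in CODON_TABLE.items():
        if aa1 != ref_aa:
            continue
        for alt_codon, aa2 in CODON_TABLE.items():
            if aa2 != alt_aa:
                continue
            diffs = [(r, a) for r, a in zip(ref_codon, alt_codon) if r != a]
            if len(diffs) == 1:
                changes.add(diffs[0])
    return changes


# (label, reference residue, variant residue, base change as printed in the text)
REPORTED = [
    ("CLLD7 Thr 500->Ile", "T", "I", ("T", "C")),
    ("CLLD8 Gly 117->Glu", "G", "E", ("G", "A")),
    ("CLLD8 Lys 408->Ile", "K", "I", ("A", "T")),
    ("CLLD8 Met 473->Val", "M", "V", ("A", "G")),
]

if __name__ == "__main__":
    for label, ref_aa, alt_aa, printed in REPORTED:
        possible = single_base_changes(ref_aa, alt_aa)
        match = "matches" if printed in possible else "does not match"
        print(f"{label}: possible {sorted(possible)}, printed {printed} {match}")
```

Under the standard code, the three CLLD8 annotations match the only possible single-base change. The Thr→Ile change is reachable only by C→T on the coding strand, so the printed "(T→C)" may list the bases in the opposite order or follow a different convention; the text does not say which.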
Discussion
We constructed a high-resolution physical map of the 13q14 region extending 300 kb centromeric from D13S273. This map was constructed based on six BAC clones spanning ∼770 kb. Five genetic markers were mapped exactly, of which D13S273 was the most centromeric. No other genetic markers are known within the 300 kb centromeric to D13S273 presented in this work. The closest marker on the radiation hybrid map is D13S165, and it is not within this sequence. We also mapped the five known genes precisely. LEU1, LEU2, and LEU5 were located between D13S1150 and D13S319 within the previous critical deleted region. As reported previously (16), LEU1 and LEU2 map very close to each other, with only 194 bp between them. Karyopherin α3 (KPNA3) was mapped near D13S273. KPNA3 consists of 17 exons, and D13S273 is located within the second intron of KPNA3. Furthermore, we mapped the NY-REN-34 antigen to the 185C18 BAC. This gene was isolated using serological analysis of recombinant cDNA expression libraries of four renal cancer patients (25). Although the role of the NY-REN-34 antigen in cancers remains unclear, the normal protein is likely to play an important role in control of cell growth because it contains a PHD finger domain, one of the zinc finger domains implicated in chromatin-mediated transcriptional control (37).
CLLD8, NY-REN-34 antigen, and CLLD7 are closely clustered within the most centromeric 150 kb of our map, making this a gene-rich region. The intervals between CLLD8 and NY-REN-34 antigen and between NY-REN-34 antigen and CLLD7 were only 5 and 3 kb, respectively.
CLLD6 is widely expressed in human cancer cell lines as well as in normal human tissues. The predicted amino acid sequence revealed no homology with known proteins that could help to elucidate a possible function for CLLD6 protein. A search of the dbEST databases excluding human EST sequences revealed significant homology with mouse, rat, zebrafish, and Xenopus ESTs (GenBank accession numbers AI006064, AA848948, AW595583, and AW644208). This result suggests that CLLD6 is well conserved, at least in vertebrates. Its widespread tissue distribution and cross-species conservation suggest that it is likely to carry out important cellular processes.
CLLD7 showed significant homology with RLG, which maps at <170 kb telomeric to RB1 at 13q14 (26). Genomic structures were also similar, with splicing sites located at precisely the same residues within the coding sequences. The expression patterns of CLLD7 and RLG are also similar. These results suggest that these genes might have resulted from gene duplication. CLLD7 contains two conserved domains, an RCC1 domain and a BTB domain. RCC1 was cloned by virtue of its ability to complement the temperature sensitivity of the hamster cell line tsBN2, which shows premature chromosome condensation at the G1 phase of the cell cycle at the nonpermissive temperature (28). Biochemically, RCC1 catalyzes guanine nucleotide exchange on the nuclear Ras-like G protein Ran (38). Ran, like Ras, is thought to function as a biological switch. Furthermore, RCC1 was shown to be essential for cell cycle regulation, condensation of chromatin (39), and formation of the mitotic spindle (40). In addition, RCC1, through the Ran pathway, is also involved in nucleocytoplasmic transport (39). Several RCC1-like proteins have been described, among which are human P532, a Brefeldin A-sensitive Golgi protein that functions as a guanine nucleotide-exchanging factor for ARF1 (41), and Ceb1, which interacts with cyclins and whose expression appears to be regulated by the tumor suppressor proteins RB1 and p53 (27). X-ray crystallography shows that RCC1 consists of a seven-bladed propeller formed from internal repeats of 51–68 residues/blade (42). An alignment of the seven repeats based on the three-dimensional structure shows seven highly conserved and six invariant residues. These conserved residues probably play an essential role in determining the structure and function of RCC1 and RCC1-like proteins. CLLD7 contains six of the seven repeats, which Devilder et al. (26) suggest are necessary and sufficient to maintain the propeller structure in these proteins.
The COOH terminus of CLLD7 is composed of a BTB domain. The BTB domain was first identified in three Drosophila zinc finger proteins, Broad-Complex, tramtrack, and bric à brac, and is generally found at the NH2 terminus of zinc finger proteins (43). Most of these proteins, such as BCL6 and PLZF, are transcriptional repressors, and the BTB domain has been found to be required for transcriptional repression. The BTB domain acts as a specific protein-protein interaction domain mediating homomeric as well as heteromeric interactions (44). Furthermore, the BTB domain also mediates heterophilic interactions. The BTB domains of BCL6 and PLZF interact with mSin3A and the corepressors N-CoR and SMRT in the process of HDAC recruitment for chromatin remodeling (45, 46). The crystal structure of the BTB domain reveals a tightly intertwined dimer with an extensive hydrophobic interface, which may be important for BTB-BTB domain interactions, as well as surface features in the dimer, which may be involved in interactions with other proteins (47). We speculate that CLLD7 is a guanine-nucleotide exchange protein involved in cell cycle regulation by chromatin remodeling through BTB domain-mediated interactions with itself or with other proteins, such as the closely related RLG.
CLLD8 shows the highest homology to SETDB1, and the type and position of domains along both SETDB1 and CLLD8 suggest that these proteins might be members of the same family. Although the function of the three conserved domains of CLLD8 and SETDB1, the methyl-CpG binding domain (MBD), preSET, and SET domains, is generally known, the precise cellular role of SETDB1 remains unknown. DNA methylation in vertebrates is associated with alterations in chromatin structure and silencing of gene expression (48, 49). The MBD was defined as the minimal region required for binding to methylated DNA by MeCP2, which binds specifically to methylated DNA (50). The MBD can recognize a single symmetrically methylated CpG either as naked DNA (50) or within chromatin (51). Recent studies have focused on chromatin modification by methyl-CpG-specific transcriptional repressors. It has been generally thought that the deacetylation of histones by HDAC leads to repression of transcription (52). Because HDAC functions as an enzyme without observable preference for a specific DNA sequence environment, HDAC associates with corepressors such as mSin3A and Mi2/NuRD, which are brought to DNA through interaction with sequence-specific transcription factors (53). MeCP2 interacts with a component of the mSin3A/HDAC complex through its transcriptional repression domain and recruits the mSin3A/HDAC complex to promoters by binding methylated DNA through its MBD to repress transcription (54, 55). This observation suggests that histone modification has important roles in regulating transcription of methylated DNA. In addition to its MBD, CLLD8 contains a SET domain at the COOH terminus. The SET domain is an evolutionarily conserved sequence motif present in chromosomal proteins from yeast to mammals (56).
The SET domain was first identified in three Drosophila chromosomal regulators, Su(var)3–9, Enhancer of zeste (E[z]), and trithorax (TRX; Ref. 57), and is now known to be present in many transcriptional regulators from different species (58). The SET protein family is divided into four subgroups according to amino acid identity within their SET domains: the E[z], TRX, ASH1, and Su(var)3–9 subgroups (56). CLLD8 belongs to the Su(var)3–9 subgroup because its SET domain is more similar to those of human G9A (34% identity and 49% similarity), Schizosaccharomyces pombe Clr4, and Drosophila Su(var)3–9. In addition to a SET domain, several SET domain proteins contain a preSET domain, also known as the SAC domain (for SET domain-associated cysteine-rich domain), which is a cysteine-rich region located immediately preceding the SET domain and which might be involved in chromosome binding (59, 60). Several studies suggest that SET proteins can modulate transcriptional activity through chromatin remodeling. Recently, Rea et al. (61) reported that SUV39H1 [human homologue of Drosophila Su(var)3–9] is an H3 histone methyltransferase, which might participate in the induction and assembly of higher-order chromatin. Mutational analysis of SUV39H1 suggested that histone methyltransferase activity toward free histones requires the combination of the SET domain with adjacent cysteine-rich regions. CLLD8 belongs to the Su(var)3–9 subgroup, as does SUV39H1, and also contains a SET domain with adjacent cysteine-rich regions. However, CLLD8 contains a bifurcated SET domain, and in this respect it differs from most SET domain proteins, such as SUV39H1. Instead, several plant methyltransferases (62), which have been classified as potential histone lysine N-methyltransferases (36), contain an insertion of about 100 amino acids in the middle of the SET domain at precisely the same position as CLLD8. These observations raise the question of whether CLLD8 might be a histone methyltransferase.
As mentioned above, we speculate that CLLD8, which has both an MBD and a bifurcated SET domain with adjacent cysteine-rich regions, may be associated with a methylation-mediated transcriptional repression process that, in turn, involves histone methylation.
Thus, both CLLD7 and CLLD8 are candidate transcriptional regulators acting through chromatin remodeling activity. Other proteins involved in chromatin modification are also involved in many chromosomal rearrangements observed in leukemias and lymphomas (63). In fact, BCL6 and PLZF, both of which contain a BTB domain, were initially identified as human oncogenes activated by chromosomal translocations. BCL6 is activated by translocations in B-cell non-Hodgkin’s lymphoma (29). PLZF is rearranged in acute promyelocytic leukemia with t(11;17)(q23;q21) translocations (30). The ALL-1 gene, which contains a SET domain, is rearranged by translocations in acute leukemias of the lymphoid and myeloid lineages and of the mixed lineage with 11q23 abnormalities (64, 65). ALL-1, the homologue of the Drosophila trithorax gene, is a chromatin-associated protein that provides a regulatory function necessary for expression of target genes. Recently, it has been shown that ALL-1 interacts with the INI1 and SNR1 proteins, which are components of the ATP-dependent chromatin remodeling complex SWI/SNF (66). ALL-1 may also regulate transcription activity through chromatin remodeling. We suggest that CLLD7 and CLLD8 are associated with chromatin modification. Thus, it is possible that CLLD7 and CLLD8 participate directly or indirectly in leukemogenesis.
To test these hypotheses, the mutation analysis of the CLLD6, CLLD7, and CLLD8 genes was carried out. Thus far, neither homozygous deletions nor point mutations were detected. However, we detected several nonsynonymous polymorphisms in CLLD7 and CLLD8. In the absence of any biochemical or biological test, it is not clear whether the amino acid changes reported here in CLLD7 and CLLD8, in particular the Lys 408→Ile (A→T) substitution within the bifurcated SET domain of CLLD8, contribute to the leukemic phenotype. Further studies to assess quantitative and/or qualitative changes in the expression and function of these genes will clarify their role in leukemogenesis.
Fig. 1. A, physical map encompassing the B-CLL critical region in 13q14. Top, alignment of BAC clones to show positions and overlaps. Center, position of genetic markers. Bottom, position of genes on the map. B, genomic structure of CLLD6, CLLD7, and CLLD8.
Fig. 2. Northern blot analysis of CLLD6, CLLD7, and CLLD8. Lane 1, heart; Lane 2, brain; Lane 3, placenta; Lane 4, lung; Lane 5, liver; Lane 6, skeletal muscle; Lane 7, kidney; Lane 8, pancreas; Lane 9, spleen; Lane 10, thymus; Lane 11, prostate; Lane 12, testis; Lane 13, ovary; Lane 14, small intestine; Lane 15, colon; Lane 16, peripheral blood leukocyte; Lane 17, lymph node; Lane 18, bone marrow; Lane 19, fetal liver; Lane 20, promyelocytic leukemia HL-60; Lane 21, HeLa cell S3; Lane 22, chronic myelogenous leukemia K-562; Lane 23, lymphoblastic leukemia MOLT-4; Lane 24, Burkitt’s lymphoma Raji; Lane 25, colorectal adenocarcinoma SW480; Lane 26, lung carcinoma A549; Lane 27, melanoma G361; Lane 28, MC/CAR; Lane 29, HuNS1; Lane 30, HT; Lane 31, DB; Lane 32, SR; Lane 33, RPMI6666; Lane 34, Hs445.
A, sequence comparison between CLLD7 and RLG (67% identity and 79% similarity). Black shading, identical or conserved residues found in all proteins at a given position. ∗, the residues of nonsynonymous polymorphisms. NH2-terminal box, RCC1 domain. COOH-terminal box, BTB domain. B, conserved domains within CLLD7.
A, alignment of the RCC1 domain of CLLD7, RLG (72% identity and 83% similarity), Ceb1, and RCC1. Black shading, highly conserved or invariant residues in each repeat. B, alignment of the BTB domain of CLLD7, RLG (68% identity and 86% similarity), BCL6, and PLZF. Black shading, identical or conserved residues found in all proteins at a given position. Gray shading, identical or conserved residues found in at least 75% of the proteins at a given position.
A, amino acid sequence of CLLD8. Light gray shading, MBD; dark gray shading, preSET domain; black shading, bifurcated SET domain. ∗, the residues of nonsynonymous polymorphisms. B, conserved domains within CLLD8.
A, alignment of the MBD domain of CLLD8, SETDB1 (40% identity and 63% similarity), MBD3, and MeCP2. Black shading, identical or conserved residues found in all proteins at a given position; gray shading, identical or conserved residues found in at least 75% of the proteins at a given position. B, alignment of the preSET domain of CLLD8, SETDB1 (40% identity and 54% similarity), and SUV39H1. C, alignment of the SET domain of CLLD8, SETDB1 (54% identity and 73% similarity), SUV39H1, ALL-1, and Rubisco ls-MT.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Supported by Program Project Grants P01CA76259, P01CA81534, and P30CA56036 from the National Cancer Institute and by a Kimmel Scholar Award (to F. B.).
The abbreviations used are: B-CLL, B-cell chronic lymphocytic leukemia; BAC, bacterial artificial chromosome; CLLD6, CLLD7, or CLLD8, CLL deletion gene 6, 7, or 8, respectively; EST, expressed sequence tag; RACE, rapid amplification of cDNA ends; RLG, RCC1-like G exchanging factor; Rubisco ls-MT, ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit N-methyltransferase; MBD, methyl-CpG binding domain; HDAC, histone deacetylase.
GenBank accession numbers mentioned in this work are: 369L16, AF334404; 317G11, AC069475; LEU1, Y15227; LEU2, Y15228; LEU5, AJ224819; Karyopherin α3, NM_002267; NY-REN-34 antigen, NM_016119; CLLD6, AF334405; CLLD7, AF334406; and CLLD8, AF334407. GenPept accession numbers mentioned in this work are: RLG, NP_001259; Ceb1, NP_057407; RCC1, NP_001260; BCL6, P41182; PLZF, NP_005997; SETDB1, NP_036564; MBD3, NP_003917; MeCP2, NP_004983; SUV39H1, NP_003164; ALL-1, CAA93625; Rubisco ls-MT, S53005; human G9A, S30385; S. pombe Clr4, CAA07709; and Drosophila Su(var)3–9, S47004.
Internet address: http://www.ncbi.nlm.nih.gov/BLAST.
This program is available via http://www.psc.edu/biomed/genedoc.
Sequence analysis in hematopoietic cell lines and B-CLL tumor samples
Genes | Amino acid and nucleotide changes | Hematopoietic cell lines | B-CLL tumor samples |
---|---|---|---|
Nonsynonymous polymorphisms | | | |
CLLD7 | Val 4→Ala (T→C) | 6/29 (heterozygous 9/29) | 14/32 (heterozygous 7/32) |
CLLD7 | Thr 500→Ile (C→T) | 0/30 (heterozygous 1/30) | 0/34 (heterozygous 0/34) |
CLLD8 | Gly 117→Glu (G→A) | 3/30 (heterozygous 14/30) | 4/34 (heterozygous 10/34) |
CLLD8 | Lys 408→Ile (A→T) | 0/30 (heterozygous 1/30) | 0/34 (heterozygous 1/34) |
CLLD8 | Met 473→Val (A→G) | 2/30 (heterozygous 9/30) | 4/34 (heterozygous 14/34) |
Synonymous polymorphisms | | | |
CLLD7 | Ile at 18, Ile at 116, Val at 133, Leu at 215, Ser at 317 (C→T, T→C, A→G, T→C, C→A) | | |
CLLD7 | Asp at 330, Thr at 338, Pro at 339 (C→T, T→C, G→C) | | |
CLLD8 | Gln at 484 (A→G) | | |
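As a reading aid for the table above, the genotype counts can be converted to variant allele frequencies. The sketch below is illustrative only and is not an analysis performed in this study. It assumes that the unlabeled fraction in each cell counts samples homozygous for the variant, that the parenthesized fraction counts heterozygous samples, and that every sample carries two copies of the locus. The last assumption may not hold for samples with a 13q14 deletion. The function name `variant_allele_frequency` and the dictionary layout are hypothetical.

```python
from fractions import Fraction


def variant_allele_frequency(homozygous, heterozygous, samples):
    """Variant allele frequency, ASSUMING two alleles per sample.

    homozygous   -- samples carrying the variant on both alleles (assumed
                    meaning of the unlabeled fraction in the table)
    heterozygous -- samples carrying the variant on one allele
    samples      -- total samples genotyped
    """
    if samples <= 0:
        raise ValueError("samples must be positive")
    if homozygous < 0 or heterozygous < 0 or homozygous + heterozygous > samples:
        raise ValueError("genotype counts are inconsistent with sample total")
    return Fraction(2 * homozygous + heterozygous, 2 * samples)


# Counts copied from the nonsynonymous rows of the table:
# (homozygous, heterozygous, total) for cell lines and for B-CLL samples.
NONSYNONYMOUS = {
    ("CLLD7", "Val 4->Ala"): {"cell_lines": (6, 9, 29), "b_cll": (14, 7, 32)},
    ("CLLD7", "Thr 500->Ile"): {"cell_lines": (0, 1, 30), "b_cll": (0, 0, 34)},
    ("CLLD8", "Gly 117->Glu"): {"cell_lines": (3, 14, 30), "b_cll": (4, 10, 34)},
    ("CLLD8", "Lys 408->Ile"): {"cell_lines": (0, 1, 30), "b_cll": (0, 1, 34)},
    ("CLLD8", "Met 473->Val"): {"cell_lines": (2, 9, 30), "b_cll": (4, 14, 34)},
}

if __name__ == "__main__":
    for (gene, change), groups in NONSYNONYMOUS.items():
        cell = variant_allele_frequency(*groups["cell_lines"])
        tumor = variant_allele_frequency(*groups["b_cll"])
        print(f"{gene} {change}: cell lines {float(cell):.3f}, B-CLL {float(tumor):.3f}")
```

Under these assumptions, the Lys 408→Ile variant corresponds to a single variant allele in each sample group, so its frequency alone cannot indicate whether the substitution is relevant to the leukemic phenotype.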
Acknowledgments
We thank Dr. Tatsuya Nakamura for helpful comments and critical reading of the manuscript. We also thank Masayoshi Shimizu and Shashi Rattan for excellent technical assistance.