The ability to induce pluripotent stem cells from committed, somatic human cells provides tremendous potential for regenerative medicine. However, there is a defined neoplastic potential inherent to such reprogramming that must be understood and may provide a model for understanding key events in tumorigenesis. Using genome-wide assays, we identify cancer-related epigenetic abnormalities that arise early during reprogramming and persist in induced pluripotent stem cell (iPS) clones. These include hundreds of abnormal gene silencing events, patterns of aberrant responses to epigenetic-modifying drugs resembling those for cancer cells, and presence in iPS and partially reprogrammed cells of cancer-specific gene promoter DNA methylation alterations. Our findings suggest that by studying the process of induced reprogramming, we may gain significant insight into the origins of epigenetic gene silencing associated with human tumorigenesis, and add to means of assessing iPS for safety. Cancer Res; 70(19); 7662–73. ©2010 AACR.
Induction of pluripotent stem cells (iPS) from committed, somatic cells encompasses exciting biology with tremendous potential for regenerative medicine (1, 2). However, such cells are currently generated using multiple inducing factors with oncogenic potential (3–6), and mice generated with iPS have increased tumorigenicity and mortality (7). Moreover, cells with a fully reprogrammed human iPS phenotype arise only as rare clonal populations (≤0.1%) among partially reprogrammed cells, some of which display an immortalized phenotype (8). Thus, many new metrics are needed to characterize iPS (4, 9).
Induction of iPS from committed cells necessitates that a myriad of epigenetic parameters must be reversed and properly reestablished. Genome-wide studies of chromatin and DNA methylation patterns (10, 11) indicate that this is globally done remarkably well including required reversal of promoter DNA methylation and reexpression of pluripotency-related genes such as OCT4 (POU5F1), both of which mark successful reprogramming (10, 12). However, multiple loci can fail in this reversal of DNA methylation (10, 12), and use of drug-induced DNA demethylation can improve efficiency of obtaining iPS (10, 13, 14). Another implication of such abnormal loci is that the relatively short duration of the reprogramming process (2–3 weeks; refs. 1, 2, 4, 10) might implicate epigenetic mechanisms in the neoplastic potential of iPS. A recent study compared iPS to cancer (15), but the most cancer-specific, proximal promoter epigenetic changes accompanying abnormal gene expression were not outlined. We now address this issue using genome-wide analysis approaches to match gene silencing associated with DNA hypermethylation of normally unmethylated CpG island–containing promoters (16). We identify cancer-related epigenetic abnormalities associated with the timing, and degree, of inducing cellular reprogramming.
Materials and Methods
MP2 and MP4 cells were reprogrammed from IMR90 fibroblasts by lentiviral vectors, and retroviral vectors were used for all other protocols (17).
Global gene expression was analyzed using Agilent whole human genome, 4x44K, microarrays as previously described (18). Heat map color schemes are based on hierarchical clustering by Ward's algorithm and standard Euclidean distance on log-transformed expression measures adjusted for cutoffs for silent (red) and active (green) genes as previously determined (19).
Genome-wide DNA methylation analysis
We used the Infinium (Illumina, Inc.) platform (20) to analyze bisulfite-treated DNA (EZ DNA Methylation kit, Zymo Research). In this hybridization procedure, the β value for a specific CpG site is the signal of the methylation-specific probe divided by the sum of the signals of the methylation-specific and unmethylated-specific probes, giving a score of 1.0 for full methylation, 0 for absence of methylation, and intermediate values (0 < β < 1) for all signals between (21). Probes with poor overall signals (P > 0.05) were removed from analysis. For all genes, only probes positioned from −1,000 to +200 bp around the transcription start site (TSS) were analyzed. In vitro DNA-methylated genomic DNA (IVD) and DNA from DKO cells genetically deleted for DNA methyltransferases (DNMT) 1 and 3b (22) serve as fully methylated and unmethylated controls, respectively. Heat maps are based on hierarchical clustering of β values using Euclidean distance and Ward's algorithm. All probes were mapped to the genome (National Center for Biotechnology Information Build 36.3) using the Bowtie algorithm for ultrafast and memory-efficient alignment of short DNA sequences (Genome Biology, 10, R25), with genome annotation via the matching release of the Ensembl database. X-linked genes were removed from analyses.
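The β-value definition and the two probe filters described above can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline used in the study: the `Probe` record, its field names, and the function names are hypothetical, and vendor software may add a small stabilizing constant to the denominator, which is omitted here.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """Hypothetical record for one Infinium CpG probe (field names are assumptions)."""
    probe_id: str
    meth_signal: float     # signal of the methylation-specific probe
    unmeth_signal: float   # signal of the unmethylated-specific probe
    detection_p: float     # detection P value for the overall signal
    tss_offset: int        # position in bp relative to the TSS (negative = upstream)


def beta_value(meth_signal: float, unmeth_signal: float) -> float:
    """Methylated signal over the sum of methylated and unmethylated signals."""
    total = meth_signal + unmeth_signal
    if total <= 0:
        raise ValueError("probe has no usable signal")
    return meth_signal / total


def keep_probe(probe: Probe, max_p: float = 0.05,
               upstream: int = -1000, downstream: int = 200) -> bool:
    """Keep probes with adequate signal (P <= 0.05) inside the -1,000 to +200 bp window."""
    return probe.detection_p <= max_p and upstream <= probe.tss_offset <= downstream


if __name__ == "__main__":
    example = Probe("cg_example", meth_signal=300.0, unmeth_signal=100.0,
                    detection_p=0.01, tss_offset=-250)
    if keep_probe(example):
        print(example.probe_id, beta_value(example.meth_signal, example.unmeth_signal))
```

A fully methylated control such as IVD would be expected to give β values near 1.0 under this definition, and the DKO control values near 0.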
Drug-responsive genes were selected from Agilent expression arrays (see Supplementary Fig. S3) based on a 1.41-fold expression (0.5 on a log2 scale) difference between mock versus 5-aza-2′-deoxycytidine (DAC)– or trichostatin A (TSA)–treated samples and classified as responsive to DAC alone but not TSA, to TSA alone but not DAC, and to both DAC alone and TSA alone.
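The selection rule above can be written as a short Python sketch. It assumes that expression values are already on a log2 scale and that "responsive" means an increase over mock of at least 0.5 log2 units (about 1.41-fold); the direction of the change, the function names, and the category labels are illustrative assumptions rather than details taken from the study.

```python
LOG2_CUTOFF = 0.5  # 0.5 on a log2 scale, i.e. roughly a 1.41-fold difference


def is_responsive(mock_log2: float, treated_log2: float,
                  cutoff: float = LOG2_CUTOFF) -> bool:
    """True if the treated sample exceeds mock by at least the log2 cutoff (assumed direction)."""
    return (treated_log2 - mock_log2) >= cutoff


def classify_gene(mock_log2: float, dac_log2: float, tsa_log2: float) -> str:
    """Assign a gene to one of the drug-response classes described in the text."""
    dac = is_responsive(mock_log2, dac_log2)
    tsa = is_responsive(mock_log2, tsa_log2)
    if dac and tsa:
        return "DAC and TSA"
    if dac:
        return "DAC only"
    if tsa:
        return "TSA only"
    return "unresponsive"


if __name__ == "__main__":
    # Hypothetical log2 expression values for a single gene.
    print(classify_gene(mock_log2=1.0, dac_log2=1.6, tsa_log2=1.1))
```

Counting genes per class for each cell type then gives the kind of DAC-versus-TSA proportions compared across normal cells, reprogrammed clones, and cancer cell lines.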
Characteristics of reprogrammed human cell lines
We first examined tumor xenografts from reprogrammed clones (Supplementary Table S1A) for pluripotent potential and neoplastic features. The highest cancer potential was for clone MP4, a fibroblast line induced by lentiviral introduction of OCT4, SOX2, NANOG, LIN28, and SV40 T-antigen (OSNLT; refs. 1, 17) and known to be only partially reprogrammed (17). MP4 cells express undifferentiated markers such as OCT4 but not TRA-1-60, are refractory to differentiation induction in vitro and in vivo (i.e., nullipotent), and possess an abnormal karyotype. Mouse xenografts show high nuclear to cytoplasmic ratio, an extremely high mitotic rate, areas of necrosis, and the histology of a primitive, aggressive, mesenchymal tumor (Fig. 1A). In comparison, clone MP2, prepared identically to MP4 and a typical iPS expressing TRA-1-60 and all classic iPS markers (17), forms what seems to be a benign, multilineage teratoma (Fig. 1A, top right). However, this clone has an abnormal karyotype, and on closer examination, the teratoma contains small foci suggesting malignancy, including regions where cells infiltrate host skeletal muscle bundles (Fig. 1A, bottom right).
We then performed blinded comparison (B.S. and D.M.B.) of teratomas (Fig. 1; Supplementary Table S1B) from the conventional embryonic stem cell (ESC) lines H1, H9, and SC233 versus those from multiple additional pluripotent iPS with normal karyotypes and metrics of fully reprogrammed human iPS (23), including expression of the embryonic markers AP, OCT4, and TRA-1-60. These include MR45 and MR46, generated from IMR90 fibroblasts by retrovirally introducing OCT4, SOX2, KLF4, and c-MYC (OSKM; ref. 2); MB41, MB45, MMW1, and MMW2, generated with the same retroviral vectors but from mesenchymal stem cells (MSC); and two iPS lines (MR31 and MR32) retrovirally derived from IMR90 fibroblasts using three factors (OSK, without c-MYC). All formed mouse xenografts with differentiated cell types of multiple embryonic germ layers (Fig. 1B). However, the successfully reprogrammed iPS cell xenografts show a range of degree of maturation (from 0.067 to 0.231 lineage structures/areas examined) of defined structures such as cartilage, bone, and intestine (Fig. 1C), which is lower than the range of values (0.338–0.97) for three ESC-derived teratomas (Fig. 1B and C; Supplementary Table S1B). Most importantly, all iPS teratomas examined have foci with malignant-like characteristics, which include focal necrosis, nuclear pleomorphism, aberrantly high mitotic rates, and infiltration into the mouse musculature (examples shown in Fig. 1C; Supplementary Table S1B). Such foci were generally not found in the three ESC-derived teratomas (Supplementary Table S1B), save for one small area of focal necrosis in the teratoma derived from H1 ESC.
Overall gene expression profiles of reprogrammed clones
We compared, using full transcriptome Agilent arrays, global gene expression between representative reprogrammed clones versus ESCs and found, as have others (10), highly similar, but not identical, patterns. Even the partially reprogrammed MP4 clone deviated only slightly from the pluripotent clones and ESC and clustered separately from four human cancer cell lines (Supplementary Fig. S1). However, focus on genes highly expressed in ESC revealed distinct differences. Compared with IMR90 fibroblasts, pluripotent clones MR46 and MR45 significantly upregulated 12 of 16 and 13 of 16 such genes, respectively, to levels in ESC (Supplementary Table S3), whereas MP2 increased only 11 of 16 above levels in parent fibroblasts, and only 8 to levels in ESC. The malignant, partially reprogrammed MP4 properly increased only 5 of the 16 genes with respect to starting fibroblasts, and only 6 reached levels in ESC. In addition, the partially reprogrammed clone MP4 expressed both OCT4 and c-MYC, due to incomplete repression of introduced transgenes, at levels significantly higher than in ESC (Supplementary Table S3).
Abnormal gene silencing and gene responses to epigenetic-modifying agents associated with cellular reprogramming
We next linked, in reprogrammed clones, gene expression to epigenetic aspects of neoplasia with respect to abnormal gene silencing events during reprogramming. In pluripotent clones MR46, MR45, and MP2, normal silencing events (Fig. 2A, left) dominate over abnormal silencing events for both genes that should be silent in fibroblasts but activated in ESC, or which should be active in both cell types (Fig. 2A, middle and right), whereas the opposite is true for the tumorigenic MP4 cells (Fig. 2B and C; Supplementary Table S4). However, whereas MP4 has the highest number of abnormal silencing events (∼800), even the best-performing iPS clone, MR46, has ∼100 such genes (Fig. 2B and C, combined; Supplementary Table S4). These numbers may be characteristic of iPS because analysis of the best-reprogrammed mouse iPS clone [MCV8.1, subclone 8.1 (24) from a recent study of Mikkelsen and colleagues (10)] revealed 418 abnormally silenced genes (Supplementary Fig. S2).
In cancer, DNA hypermethylation of CpG island promoters (16) is a prime candidate for mediating abnormal gene silencing, and several abnormally silenced genes in reprogrammed clones are known to have such changes, including CDKN2B (Supplementary Ref. 1), LXN (Supplementary Refs. 2, 3), TIMP3 (Supplementary Refs. 2, 4), and PYCARD (Supplementary Ref. 5). We thus investigated global gene expression responses to both the DNA demethylation agent DAC and the histone deacetylase (HDAC) inhibitor TSA. DAC effectively reexpresses genes in cancer with densely hypermethylated, promoter CpG islands, whereas TSA alone does not (see Supplementary Fig. S3 for array expression responses; ref. 18). The results strikingly separate cancer cells, iPS cells, and especially the partially reprogrammed MP4 clone from parent fibroblasts, adult MSC, and ESC. Addition of either DAC or TSA (Fig. 2D) reactivates between 67% (MR46) and 84% (MR45) of the abnormally silenced genes in Fig. 2A. A dramatic finding is that, in partially reprogrammed clone MP4, and to a lesser degree other iPS (Fig. 2D), more induced silent genes are responsive to DAC alone, or to both DAC and TSA (left to right, Fig. 2D), rather than to TSA alone, and this is also true, to a slightly lesser extent (Fig. 2E), for genes with abnormally retained silencing (those in Fig. 2A, right). This is true for both CpG island and non-CpG island–containing genes (Fig. 3A and B, respectively). The pattern for MP4 cells begins to resemble that for the cancer cell lines, which have an extraordinary dominance of DAC alone– versus TSA alone–responsive genes. Moreover, genes in normal ESC, adult MSC, and committed bone progenitor cells (osteoblasts) have a starkly dominant response to TSA alone (Fig. 3A and B). These patterns also hold when ∼400 CpG island–containing genes responsive to DAC alone in the U2OS osteosarcoma cells (Fig. 3C) and HT1080 fibrosarcoma cells (Fig. 3D) serve as their own controls (i.e., for these genes, TSA responses dominate in normal cells, DAC responses dominate in cancer cells, and the reprogrammed clones show a mixed response to DAC and TSA consisting of a predominantly DAC response most apparent in the MP4 clone). Finally, normal committed IMR90 fibroblasts have more of a mixture of DAC- and TSA-responsive genes, but with responses still skewed toward TSA (Fig. 3A and B), and the increased frequency of DAC-responsive genes even separates all of the fully reprogrammed clones from ESC and adult MSC (Fig. 3A and B).
Taken together, the data indicate that reprogrammed somatic cells can acquire both abnormal gene silencing events and aberrant responses to epigenetic-modifying drugs that deviate from normal and are reminiscent of changes observed in cancer cells.
Cancer-related promoter DNA methylation changes arise early during reprogramming
To further examine abnormal gene silencing and cancer-specific promoter DNA methylation on a global scale, we used the Infinium platform (25), which queries ∼27,000 CpG sites, most, but not all, in annotated promoter CpG islands. When focused on analyzing CpGs located from −1,000 to +200 bp from gene TSS, we find, as have others in genome-wide DNA methylation studies (10, 26, 27), that the vast majority (∼90%) of ∼7,500 loci in ∼5,700 different genes with well-annotated promoter CpG islands are unmethylated in all normal cells (ESC, MSC, and fibroblasts; Supplementary Fig. S4). In contrast, for ∼3,750 probes in ∼800 autosomal genes lacking dense CpG island promoters, these CpG-poor promoters are far more methylated, with a varied tissue pattern, in normal cells (Supplementary Fig. S5), consistent with studies of others (12, 26) and verifying a long-held biological premise. Finally, consistent with our previous studies (18), gene promoter CpG island hypermethylation in cancer is starkly apparent (Supplementary Fig. S4). The number of unmethylated CpG island promoters falls to ∼78% in four adult cancer cell lines and to ∼87% in two germline teratocarcinoma lines, representing 900 hypermethylated genes in the former and >200 in the latter. Non-CpG island promoters have not been as carefully examined in cancer, but the cancer cell lines still cluster separately from normal cells largely because of multiple genes that have gained methylation but also due to many that have lost normal methylation (Supplementary Fig. S5). Importantly, for cancer cell lines such as HCT116 colorectal cells, the Infinium analyses correctly identified (data not shown) ∼90% of genes validated in our laboratory to have cancer-specific DNA hypermethylation (18).
With this background, we examined not only our reprogrammed cell clones but also pools of cells (Supplementary Table S1 for listing) from early stages (days 6–18) following insertion of four factors into both our fibroblast and MSC parent cells and after introduction of OCT4 alone into fibroblasts. In these pools, cells are in a dynamic meta-stable state with a wide range of reprogramming stages (10, 28), wherein most clones do not exhibit ESC morphology and many cells can be cultured indefinitely. Overall, cells in these pools maintain the normally unmethylated state of CpG island promoters, harboring far fewer abnormalities than seen in cancer cells (Supplementary Fig. S4). However, we observe that among normally unmethylated CpG island genes, 50 show abnormal methylation in one or more of the early cell pools by day 6, and 38 have abnormally added methylation in individual reprogrammed clones (Fig. 4A; Table 1, individual clone genes; Supplementary Table S5, pool genes). Although the genes generally differ between the pools and clones, ∼15% are shared between the two. Of 10 clones examined, including partially reprogrammed MP4 cells, only 2 (MMW1 and MMW2) failed to exhibit a hypermethylated gene, whereas the remainder contained between 3 (MR46) and 17 (MB45). Significantly, >60% of all these genes (P = 3.055 × 10⁻⁷, Fisher's exact test) in the clones and 50% (P = 7.302 × 10⁻⁵, Fisher's exact test) in the pools are also hypermethylated in one or more cancer cell lines (Fig. 4A; Table 1 and Supplementary Table S5, individual genes). Finally, and very importantly, even pooled cells with introduction of OCT4 alone contained from 12 to 16 hypermethylated genes, and again, ∼50% of these were also hypermethylated in at least one cancer cell line (Fig. 4A; Supplementary Table S5).
NOTE: Dark squares, β values above 0.45; light gray squares, values ranging to 0.45.
The fully reprogrammed iPS clones generally also behave very well with respect to non-CpG island genes. Thus, for all of these clones, 87% to 95% of such promoters properly either gain or lose DNA methylation relative to ESC (Fig. 4B). Interestingly, and as might be expected, most of the genes in these groups do not make the changes required for iPS in the reprogramming pools of fibroblasts or MSC, generally retaining the methylation patterns of the starting parent cells (Fig. 4B). Again, despite the global proper behavior, multiple non-CpG island genes, relative to normal cells, both gain and lose promoter methylation. Thirteen genes abnormally gain methylation in the clones (Supplementary Table S6), and another 13 in the pools (Supplementary Table S7), whereas 56 and 37 lose methylation, respectively (Supplementary Tables S6 and S7). In particular, MP4 contains 37 abnormal genes: 5 having abnormal gain and 32 having abnormal losses of methylation (Fig. 4B; Supplementary Tables S6 and S7). A high percentage, 60% for the clones (P = 1.645 × 10⁻¹², Fisher's exact test) and 43% for the pools (P = 5.141 × 10⁻⁸), of the abnormal gains and losses of DNA methylation also appear in one or more of the cancer cell lines (Supplementary Tables S6 and S7). Finally, for MP4, only 35% of non-CpG island genes properly gained or lost DNA methylation relative to ESC.
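The overlap statistics reported in this section rely on Fisher's exact test. The sketch below shows how a one-sided enrichment P value can be computed for a 2 × 2 table using only the Python standard library. The table layout, the function name, and the counts in the example are illustrative assumptions; they are toy numbers, not the counts analyzed in this study, and the text does not state whether a one-sided or two-sided test was used.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical):
      a = abnormal in reprogrammed cells and abnormal in cancer lines
      b = abnormal in reprogrammed cells only
      c = abnormal in cancer lines only
      d = abnormal in neither
    Returns the hypergeometric probability of observing an overlap of a or more.
    """
    row1 = a + b            # genes abnormal in reprogrammed cells
    col1 = a + c            # genes abnormal in cancer lines
    n = a + b + c + d       # all genes considered
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k)
                    for k in range(a, upper + 1))
    return numerator / comb(n, row1)


if __name__ == "__main__":
    # Toy counts chosen only to demonstrate the calculation.
    print(fisher_exact_greater(3, 1, 1, 3))
```

With genome-scale tables the same function applies unchanged, because Python integers do not overflow when the binomial coefficients become large.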
Importantly, many of the above abnormal CpG island genes are hypermethylated in primary human cancers and have roles in malignancy and/or embryonic cell fate patterning. GATA4 (Supplementary Ref. 6), central to proper endodermal differentiation, is hypermethylated in multiple cancer types; O6-MGMT (Supplementary Refs. 7, 8) is also hypermethylated in cancer and loss of function impedes DNA repair; TCERG1L has been recently identified as a low-frequency mutated gene in colon cancer (Supplementary Ref. 9) and is DNA hypermethylated in virtually all such tumors (J.-M. Yi, N. Ahuja, S.B. Baylin, L. Van Neste, J.G. Herman, unpublished data).
For a selection of these important genes, we verified Infinium results by performing the methylation-specific PCR (MSP) assay (29), querying four to six CpG sites positioned near the Infinium probes yielding hypermethylation values. Although Infinium probes query small numbers of CpGs in the islands, and development of abnormal methylation may be quite partial over the short time of reprogramming, 7 of 10 genes were methylated by MSP, and at one of two promoter sites queried, TCERG1L is fully methylated in iPS clones MP2 and MR45 (Supplementary Fig. S6A and B).
Abnormally silenced genes, chromatin, and cellular reprogramming
The small fraction of DAC-responsive (Supplementary Fig. S3), silenced genes in iPS and reprogramming pools that are DNA methylated in reprogrammed cells is surprising. Although this could represent failure to detect very partial DNA methylation in queried promoters, it may involve links between abnormal gene silencing, DNA methylation in cancer, and chromatin states of embryonic cells. We (30) and others (31, 32) have found that high percentages of DNA hypermethylated genes in cancer are marked by polycomb group silencing proteins (PcG) in embryonic cells. Such genes may have an abnormal progression from PcG marking to promoter DNA hypermethylation in cancer (27, 30). In ESC, such promoters are not methylated but rather maintained in a low poised expression state by PcG regulation (33–35) and “bivalent” chromatin constituted by transcriptional positive (H3K4me2 and me3) and negative (the PcG mark, H3K27me3) histone modifications (36). Interestingly, abnormally low expression genes in partially reprogrammed cells from mice have bivalent chromatin (28), and aggressive human cancers have increased expression of PcG genes (37) to levels such as those found in our iPS cells (Supplementary Table S2).
We thus queried the promoter occupancy of our abnormally silenced genes in iPS clones, reprogramming pools, and cancer cell lines (Table 1; Supplementary Tables S4 and S5) in databases of others for embryonic cells (37–40). Although there is no significant enrichment at the promoters of these genes for occupancy by the iPS reprogramming factors themselves (37), save for NANOG at the promoters of the DNA hypermethylated genes in Table 1 and Supplementary Table S5, in the clones and partially reprogrammed pools, there is an enrichment for PcG promoter marking (Fig. 5A). This is confirmed by local chromatin immunoprecipitation (ChIP) of the PcG mark H3K27me3 for promoter regions of selected genes in ESC, iPS clone MP2, and the partially reprogrammed clone MP4 (Fig. 5B).
It seems that during the reprogramming process, iPS may be prone to abnormal epigenetic changes characteristic of neoplasia. Our results differ from a recent study (15) in which differential changes both between cancer and normal cells, and iPS and normal ESC, were thought confined to “shores” or regions upstream from promoter CpG islands. In contrast, we find that cancer-specific promoter CpG island hypermethylation is easily visualized on the Infinium arrays in hundreds of genes in the cancer cells (Fig. 4; Supplementary Figs. S4 and S5). These changes can be seen, to a lesser degree, in iPS and partially reprogrammed cells (Table 1; Supplementary Table S4; Fig. 4A).
The abnormal gene silencing events in reprogramming could well involve the inducing factors themselves. We find that just one essential reprogramming factor, OCT4 (4, 41), is sufficient to produce cancer-specific epigenetic changes, and this may help explain why its forced overexpression in the germline of mice produces a striking neoplastic phenotype in skin and intestine (42).
A dramatic finding in our study concerns the dominance of gene responsiveness to TSA in ESC but to DAC in cancer cells and iPS. The chromatin of ESC especially exists in a far more open, and histone-acetylated, pattern than in more committed embryonic and adult cells (43). Many genes in ESC may be in a poised state for induction in response to commitment signals and are balanced between very active targeting by histone acetyltransferases and removal of acetylation by HDACs. Blocking HDACs would then shift the dynamics to allow rapid promoter acetylation to enhance expression of such genes. The more dominant response to DAC in iPS, and especially cancer cells, may reflect an abnormally repressed state of genes, which prevents normal cell differentiation and/or full progression of cells during reprogramming to the ESC epigenetic state.
iPS and cancer cells do differ dramatically in that the Infinium assay validates that a majority (>70%) of basally silent genes that respond to DAC alone but not to TSA (i.e., candidate genes in Supplementary Fig. S3) have promoter DNA methylation in cancer cells but not in iPS cells. A key explanation may lie in the potential molecular progression from PcG promoter occupancy to DNA promoter hypermethylation that may be ongoing in cancer as suggested by ourselves and others (27). This progression may involve targeting of DNMTs to involved genes (27) and resultant DAC reexpression of even non–DNA-methylated genes. DNMTs bind corepressors, experimentally can repress gene promoters independently of catalyzing DNA methylation, and can act as scaffolding proteins by using regions distal to their DNA methylation catalytic sites (44–46). To inhibit DNMTs, DAC incorporates into DNA and covalently binds to DNMTs, resulting in removal of DNMT1 from the soluble nuclear protein pool (47) and its degradation (48). Thus, DAC may reexpress genes with or without DNA methylation, as indicated by our recent result for overexpressing a polycomb constituent in teratocarcinoma cells (49).
In summary, epigenetic changes may contribute to neoplastic potential of iPS. Mapping these may help provide constructive parameters to monitor the preparation and use of such cells, guide use of epigenetic-modifying drugs to enhance efficiency of obtaining iPS (10), and help derive gene marker panels for the above purposes. Finally, cellular reprogramming may provide a model to study how epigenetic abnormalities may be central to the origins of cancer and whether reprogramming might play a role in the formation of key subpopulations of cancer cells. For example, cellular reprogramming might underlie the fact that neoplasia can initiate even in mature populations of normal cells (50).
Disclosure of Potential Conflicts of Interest
S.B.B. consults for OncoMethylome Sciences. MSP is licensed to OncoMethylome Sciences in agreement with Johns Hopkins University (JHU), and S.B.B. and JHU are entitled to royalty shares received from sales.
We thank Dr. Saul Sharkis and members of the Baylin and Cheng labs for reading the manuscript and for helpful discussions and Kathy Bender for manuscript preparation.
Grant Support: NIH grants CA116160 and UO1 HL099775 (S.B.B.) and State of Maryland TEDCO.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.