Abstract
Pluripotent stem cells, both human embryonic stem cells (hESC) and human-induced pluripotent stem cells (hiPSC), can give rise to multiple cell types and hence have tremendous potential for regenerative therapies. However, the tumorigenic potential of these cells remains a great concern, as reflected in the formation of teratomas by transplanted pluripotent cells. In clinical practice, most pluripotent cells will be differentiated into useful therapeutic cell types such as neuronal, cardiac, or endothelial cells prior to human transplantation, drastically reducing their tumorigenic potential. Our work investigated the extent to which these differentiated stem cell derivatives are truly devoid of oncogenic potential. In this study, we analyzed the gene expression patterns from three sets of hiPSC- and hESC-derivatives and the corresponding primary cells, and compared their transcriptomes with those of five different types of cancer. Our analysis revealed a significant gene expression overlap of the hiPSC- and hESC-derivatives with cancer, whereas the corresponding primary cells showed minimum overlap. Real-time quantitative PCR analysis of a set of cancer-related genes (selected on the basis of rigorous functional and pathway analyses) confirmed our results. Overall, our findings suggested that pluripotent stem cell derivatives may still bear oncogenic properties even after differentiation, and additional stringent functional assays to purify these cells should be done before they can be used for regenerative therapy. Cancer Res; 71(14); 5030–9. ©2011 AACR.
Introduction
Human embryonic stem cells (hESC) are derived from the inner cell mass of a human blastocyst stage embryo (1). They are characterized by their ability to both self-renew and differentiate into all somatic tissues of the embryo. After transplantation into immunosuppressed mice, they spontaneously differentiate and form tumors (teratomas), in which there is disordered differentiation into various tissue types of the early embryo (2). The tumorigenic nature of hESCs has been previously described (3, 4) and is considered a major obstacle to their clinical utilization. Though teratomas may be considered a relatively benign, disorganized bulk of normal embryonic tissues, the formation of a teratoma after hESC transplantation in human patients is entirely unacceptable. The clinical hurdles facing utilization of hESC-based grafts in the clinic are comprehensively discussed elsewhere (5).
In contrast to hESCs, the concept of deriving pluripotent cells from somatic cells by reversing the natural differentiation process that occurs during development has long been explored (6, 7). The real quantum leap in this effort was finally realized with the generation of human-induced pluripotent stem cells (hiPSC) in 2007 (8, 9), which has freed regenerative therapies from the ethical concerns that are frequently aroused by hESCs. However, hiPSCs are currently generated using multiple inducing factors that may have oncogenic potential (10–13), and it has been shown that mice generated from murine iPSCs have increased tumorigenicity and mortality (14). In addition, the fully reprogrammed hiPSC phenotype arises only as rare clonal populations (≤0.1%) among partially reprogrammed cells (15). Given these issues, the practice of reprogramming adult cells into hiPSCs faces several hurdles that must be overcome before it can have any practical clinical applications. For instance, the efficiency of reprogramming needs to be improved, and hiPSCs need to be generated in a manner that avoids any exogenous sequences that may induce malignancy (13, 16, 17). Similar to hESCs, even a small number of undifferentiated cells may give rise to teratomas after hiPSC transplantation.
Recent studies indicate that stem cells and tumor cells share many common master regulatory genes (18–21). The interwoven nature of pluripotency and tumorigenicity programs is revealed by the molecular machinery shared by them, and it has become a major challenge to untangle the determinants of pluripotency from those responsible for tumorigenicity. This entanglement is exemplified by the fact that many of the genes used to produce hiPSCs are either outright oncogenes such as Myc and Klf4 (22, 23), or are in subtle ways linked to tumorigenesis such as Sox2 (24), Nanog (19), and Oct3/4 (25). Note that hESCs are also defined by the expression of a battery of these genes, mostly Oct4, Nanog, and Sox2 (26).
The primary method for eliminating the problem of tumorigenicity is to induce differentiation of hESCs or hiPSCs into the required cell type prior to transplantation. Although the tumorigenic potential of these pluripotent cells seems to be greatly reduced in vivo when the cells are predifferentiated in vitro, the risk of teratoma formation still exists due to carryover of contaminating pluripotent cells. Furthermore, even with differentiated derivatives that are free of contaminating undifferentiated cells, the tumorigenicity risk still remains, as several reports have shown that transplanted derivatives of ESCs may also produce tumors (27–30). For example, it has been shown that after injection of neural marker-selected derivatives of mouse ESCs into the subretinal space of rhodopsin−/− mice (28), teratomas were formed that caused eye malformation within 2 months after transplantation. Two studies have shown that ESC-derived neural precursors for transplantation into fetal brains of mice (29) and dopaminergic neuron progenitors for transplantation into Parkinsonian rats (28) also showed signs of tumor formation. Another study showed that injection of beating embryoid bodies containing cardiomyocytes into the myocardium still led to teratoma formation, albeit at a later onset compared with undifferentiated ESCs (30). These are examples of different fates that pluripotent stem cell derivatives can assume, and extreme care should be taken to exclude potentially tumorigenic cells from the transplantable population. Studies must be done for each differentiated derivative that is considered for therapy before it can be regarded as safe for clinical application.
To address these issues of tumorigenicity, we carried out a rigorous transcriptional analysis of the different hiPSC- and hESC-derived cell lines and corresponding human primary cells that have been previously reported (31–35). We compared these transcriptomes with different cancer cell lines (36–40) to delineate the tumorigenic gene expression patterns still remaining within these differentiated derivatives. We also conducted a real-time quantitative PCR (qRT-PCR) analysis of a selected panel of cancer genes within these derivatives to confirm the tumorigenic potential of these cells. Compared with their primary cell counterparts, we discovered that the pluripotent stem cell derivatives expressed higher levels of cancer-related genes even after differentiation. These findings show that our understanding of the tumorigenicity of hiPSC- and hESC-derivatives is only rudimentary at present. In the future, caution must be exercised when considering pluripotent stem cell derivatives for regenerative therapies, and methods for purifying these cell populations of undifferentiated contaminants, as well as reducing their innate oncogenic potential, must be established prior to clinical application.
Materials and Methods
Sources of gene profiles
In this study, we analyzed the transcriptional profiles of previously reported hiPSC- and hESC-derived hepatic cell lines (35), hESC-derived endothelial cell lines (Ref. 34; please note that the profiles for hiPSC-derived endothelial cell lines have been newly generated in our laboratory; ref. 41), and hiPSC- and hESC-derived neural crest cell lines (33). We also compared the gene expression data from primary hepatocytes (32), human umbilical vein endothelial cells (HUVEC; ref. 34), and neural crest cell lines (31), respectively. These profiles were then compared with those of a common set of 5 different cancer cell lines (36–40) consisting of breast cancer, myeloid leukemia, glioblastoma multiforme cancer stem cells, prostate cancer, and pancreatic cancer cell lines. The breast cancer cell lines are the rare cancer side population cells isolated from the CAL-51 human mammary carcinoma cell line that display cancer stem cell characteristics (36). Myeloid leukemia stem cells were isolated from patients with acute myeloid leukemia (38). Glioblastoma multiforme samples were isolated from patients undergoing surgical biopsies; nonadherent cellular spheroids derived from these serum-free culture conditions were considered to be enriched cancer stem cell cultures (40). For prostate cancer, cancer stem cell lines derived from prostate cancer cell line PCSC1-3 (from Celprogen) were used (37). Finally, the pancreatic cancer cell lines chosen for this study were Nor-P1, HPAF-II, CaPan-2, BxPC-3, and Panc 2.03 (39).
For the microarray profiling data, all of the hiPSCs from which the differentiated cell lines (hepatocytes, endothelial cells, and neural crest cells) were derived were originally reprogrammed from primary fibroblasts. In addition, all of the hiPSCs used in the microarray analyses were reprogrammed using lentiviral vectors that included oncogenic reprogramming factors such as c-Myc and Klf4. Specifically, the hiPSCs used to derive hepatocytes were reprogrammed from human primary foreskin fibroblasts [CRL2097; American Type Culture Collection (ATCC)] following the established protocol by Yu and colleagues (9). Concentrated replication-incompetent pseudotyped lentiviruses that expressed Oct4, Sox2, Nanog, or Lin28 were used to infect the cells. The hiPSCs that were used to derive endothelial cells were obtained from the James Thomson Lab (University of Wisconsin–Madison), and were originally derived from IMR90 fetal fibroblasts (ATCC) using the reprogramming factors Oct4, Sox2, Nanog, and Lin28 that were cloned within a lentiviral vector. The hiPSCs used to derive neural crest cells were reprogrammed from fibroblasts obtained from Coriell using a lentiviral construct with 4 factors (Oct4, Sox2, Klf4, and c-Myc) as described recently (42).
Gene expression data were obtained from the Gene Expression Omnibus (GEO) repository, which is currently the largest fully public gene expression resource. The GEO (43) repository at the National Center for Biotechnology Information archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community.
Microarray analysis
The hiPSC- and hESC-derived neural crest cells and endothelial cells were gene expression profiled with the Illumina human-6 v2.0 expression beadchip and Agilent 4 × 44K whole human genome microarray (G4112F) platform, respectively. All other gene expression data were obtained with the HG-U133plus2 microarray platform (Affymetrix). All data sets were analyzed by using GeneSpring GX 11.0 software (Agilent Technologies, Inc.). Gene-level signal estimates were derived from the raw data files. Summarization of gene expression data was done by implementing the robust multichip averaging algorithm, with subsequent baseline normalization of the log-summarized values for each probe set to that of the median log-summarized value for the same probe set in the control group. Expression data were then filtered to remove probe sets whose signal intensities for all the treatment groups were in the lowest 20th percentile of all intensity values. The data were then subjected to ANOVA, incorporating the Benjamini–Hochberg FDR multiple testing correction, with a significance level of P-value less than 0.05 to obtain the differentially expressed genes between different groups. Probe sets were further filtered on the basis of a fold-change cutoff of 2.0.
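The final two filtering steps described above (Benjamini–Hochberg correction followed by a fold-change cutoff) can be illustrated with a minimal Python sketch. It is not the GeneSpring implementation used in the study. The function names, the assumption that the 0.05 threshold is applied to the adjusted P-values, and the assumption that fold-changes are supplied on a log2 scale are all hypothetical choices made for illustration.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values stay monotone (each is the minimum of itself and all larger ranks).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_probe_sets(probe_ids, p_values, log2_fold_changes,
                      alpha=0.05, fold_cutoff=2.0):
    """Keep probe sets passing both the FDR and the fold-change filter.

    Assumption: alpha is compared against the adjusted P-value, and
    fold-changes are given as log2 ratios (so a 2.0-fold cutoff is |log2| >= 1).
    """
    adjusted = benjamini_hochberg(p_values)
    log2_cutoff = math.log2(fold_cutoff)
    return [
        pid
        for pid, q, lfc in zip(probe_ids, adjusted, log2_fold_changes)
        if q < alpha and abs(lfc) >= log2_cutoff
    ]
```

Calling `select_probe_sets` with hypothetical probe identifiers, raw ANOVA P-values, and log2 fold-changes returns only the probe sets that satisfy both criteria.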
Distance measure
We have used the statistical software package SPSS (IBM) to generate the Euclidean distance matrix and the corresponding dendrogram to calculate the distances between different sets of cells with respect to cancer cells. To calculate the relative distances among hiPSC- and hESC-derivatives and their corresponding primary cells with cancer cells, we considered 1 to be the farthest distance (Euclidean distance) obtained between cancer and the differentiated cells. We calculated the distances of the hiPSC- and hESC-derivatives from cancer cell lines for each of the data sets. The gene expression “overlap percentage” between 2 groups of cells is the percentage of genes that have a similar expression pattern. Thus, 2 “closer” groups will have a higher percentage of similar genes and therefore will have higher “overlap percentage” and vice versa.
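As a rough illustration of the distance measure described above, the following Python sketch computes Euclidean distances from each cell group to the cancer group and rescales them so that the farthest group has a relative distance of 1. The study used SPSS; the function names, the `"cancer"` key, and the toy two-gene profiles are hypothetical and serve only to show the arithmetic.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relative_distances_to_cancer(profiles, cancer_key="cancer"):
    """Distance of every non-cancer group to cancer, scaled so the farthest is 1."""
    cancer = profiles[cancer_key]
    raw = {
        name: euclidean_distance(vector, cancer)
        for name, vector in profiles.items()
        if name != cancer_key
    }
    farthest = max(raw.values())
    if farthest == 0:
        raise ValueError("all groups are identical to the cancer profile")
    return {name: distance / farthest for name, distance in raw.items()}


# Toy profiles (hypothetical values, two genes per group).
profiles = {
    "cancer": [0.0, 0.0],
    "hiPSC-derived": [3.0, 4.0],
    "hESC-derived": [6.0, 8.0],
    "primary": [12.0, 16.0],
}
relative = relative_distances_to_cancer(profiles)
```

In this toy case the primary cells are the farthest group, so their relative distance is 1 and the derivatives fall proportionally closer to cancer.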
Ingenuity Pathway Analysis
To conduct functional annotation of the differentially expressed genes among different groups, we used Ingenuity Pathway Analysis (IPA) software. This software assigns biological functions to genes by using the Ingenuity Pathways Knowledge Base (Ingenuity Systems, Inc.). The knowledge base includes information about thousands of human, mouse, and rat genes (44). This information is used to form networks to create an “interactome” of genes that are involved in specific biological processes.
Functional analysis.
The functional analysis identified the biological functions and/or diseases that were most significant to the data set. Molecules from the data set that met the P value cutoff of 0.05 and fold-change cutoff of 2.0 were then associated with biological functions and/or diseases in Ingenuity's knowledge base. Right-tailed Fisher's exact test was used to calculate a P-value determining the probability that each biological function and/or disease assigned to that data set is due to chance alone.
Canonical pathway analysis.
Canonical pathways analysis identified the pathways from the IPA library of canonical pathways that were most significant to the data set. The significance of the association between the data set and the canonical pathway was measured in 2 ways: (i) A ratio of the number of molecules from the data set that map to the pathway divided by the total number of molecules that map to the canonical pathway is displayed. (ii) Fisher's exact test was used to calculate a P-value determining the probability that the association between the genes in the data set and the canonical pathway is explained by chance alone.
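The two significance measures above can be sketched in Python as follows. This is not IPA's implementation: the function names and the explicit `background_size` argument (the size of the reference gene set) are assumptions, and the right-tailed Fisher's exact test is written out as the upper tail of the hypergeometric distribution.

```python
from math import comb


def pathway_ratio(overlap, pathway_size):
    """Fraction of the pathway's molecules that appear in the data set."""
    return overlap / pathway_size


def right_tailed_fisher(overlap, dataset_size, pathway_size, background_size):
    """Probability of seeing at least `overlap` pathway molecules by chance.

    Models drawing `dataset_size` molecules without replacement from a
    background of `background_size`, of which `pathway_size` are in the pathway.
    """
    total = comb(background_size, dataset_size)
    upper = min(dataset_size, pathway_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, dataset_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small P-value from `right_tailed_fisher` indicates that the observed overlap between the data set and the pathway is unlikely to arise by chance alone.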
Cell cultures and RNA preparation
For the qRT-PCR data, hiPSCs were obtained from the James Thomson Lab (University of Wisconsin–Madison), which were originally derived from IMR90 fetal fibroblasts using reprogramming factors Oct4, Sox2, Nanog, and Lin28 (virally reprogrammed). The H9 hESC cell line was obtained from WiCell. The nonviral minicircle hiPSCs (mc-hiPSC) were originally derived from human adipose stem cells using a minicircle vector containing 4 reprogramming factors Oct4, Sox2, Nanog, and Lin28 (45). H9 hESCs, hiPSCs, and mc-hiPSCs were cultured on Matrigel in mTeSR1 medium (Stem Cell Technologies). Endothelial cell differentiation of hiPSCs, mc-hiPSCs, and hESCs was done as previously described (34). Briefly, the pluripotent cell colonies were detached by 1 mg/mL dispase and transferred to ultra low-attachment plates for embryoid body (EB) formation. Twelve-day-old EBs were harvested and then suspended in collagen I. The mixture was then incubated at 37°C for 30 minutes to allow gel polymerization. Later, EGM-2 medium (Lonza) plus 5% Knockout serum with 50 ng/mL VEGF and 20 ng/mL fibroblast growth factor 2 (FGF2) was added. After 3 days of culture, the CD31+/CD144+ cells (representing endothelial cells) were purified by fluorescence-activated cell sorting. All 4 endothelial cell types (hiPSC-EC, mc-hiPSC–EC, hESC-EC, and HUVEC) were cultured under the same conditions. Cells were harvested at 80% confluency. Using the RNeasy Mini Kit (Qiagen Inc.), RNA was isolated from biological duplicates of hiPSC-EC, mc-hiPSC–EC, hESC-EC, and HUVEC.
The cancer cell lines, namely 3 breast cancer cell lines (MDAMB-231, MDAMB-435, and MDAMB-468) and 1 prostate cancer cell line (PC-3), were obtained from Zhen Cheng's laboratory at Stanford. These cells were cultured in Dulbecco's modified Eagle's medium and 10% FBS. When the cell confluence reached 80%, the culture medium was changed to EC culture medium [EGM-2 medium (Lonza) plus 5% Knockout serum with 50 ng/mL VEGF and 20 ng/mL FGF2]. Cells were isolated after a single passage. RNA was isolated from biological duplicates of MDAMB-231, MDAMB-435, MDAMB-468, and PC-3 by using the RNeasy Mini Kit (Qiagen Inc.).
qRT-PCR
One microgram of total RNA from each cell sample was reverse transcribed with the iScript cDNA Synthesis Kit (Bio-Rad). For each sample, qRT-PCR was done in duplicate on a StepOnePlus Real-Time PCR System (Applied Biosystems) using Taqman primer probe sets (Applied Biosystems) for each gene of interest and an 18S control primer probe set for normalization. Gene expression assay IDs for the tumor-specific genes obtained from Applied Biosystems used in the amplification reaction are as follows: TNC: Hs01115665_m1, VCAN: Hs00171642_m1, PCOLCE: Hs00170179_m1, KIAA1199: Hs00378520_m1, FOS: Hs01119267_g1, SEMA5A: Hs01549381_m1, SNAI2: Hs00161904_m1, SERPINE2: Hs00299953_m1, COL6A2: Hs00242484_m1, THBS1: Hs00962908_m1, and 18S: Hs03928990_g1. Representative results are shown as fold-expression relative to HUVEC unless otherwise stated.
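The text does not state how fold-expression was calculated from the raw qRT-PCR readout. One common approach for data normalized to a reference gene and expressed relative to a calibrator sample is the comparative Ct (2^-ΔΔCt) method, sketched below in Python under that assumption. The function name, argument names, and the averaging of technical duplicates are hypothetical, and the method assumes roughly equal amplification efficiency for the target and the 18S control.

```python
from statistics import mean


def fold_expression(target_ct, reference_ct,
                    calibrator_target_ct, calibrator_reference_ct):
    """Fold-expression of a target gene relative to a calibrator sample.

    Each argument is a list of replicate Ct values (e.g., duplicates).
    `reference_ct` is the 18S control in the sample of interest; the
    calibrator arguments are the same measurements in HUVEC.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_sample = mean(target_ct) - mean(reference_ct)
    delta_calibrator = mean(calibrator_target_ct) - mean(calibrator_reference_ct)
    # Compare the sample with the calibrator; lower Ct means more transcript.
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)
```

A value above 1 would correspond to higher expression than in HUVEC, and a value below 1 to lower expression.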
Western blot
Cells were collected in RIPA buffer (Sigma) and briefly sonicated to shear DNA and reduce sample viscosity. Protein concentration was measured by Bio-Rad Protein Assay Kit. Samples were run on a 10% Mini-PROTEAN TGX Precast Gel (Bio-Rad) and transferred onto nitrocellulose membranes. After being blocked in 5% nonfat dry milk in PBS for 1 hour, the membranes were incubated with specific antibodies overnight at 4°C (TNC: Sigma-Aldrich HPA004823; SEMA5A: antibodies-online.com ABIN171679; PCOLCE: Abcam ab39204; COL6A2: Sigma-Aldrich HPA007029; glyceraldehyde-3-phosphate dehydrogenase (GAPDH): Abcam ab9484). After 3 washes in TPBS for 10 minutes each, the membranes were incubated in goat anti-mouse or goat anti-rabbit antibody conjugated with horseradish peroxidase for 1 hour, followed by 2 washes, one in TPBS and one in PBS, for 5 minutes each. The signals were developed in ECL Chemiluminescence Kit (Amersham Biosciences).
Results
Global gene expression patterns reflect an oncogenic potential in derivatives from hESCs and hiPSCs
To conduct our analysis, we grouped the data into 3 sets that were based on cell lineage. The first group includes hiPSC-derived hepatocytes (hiPSC-HEP), hESC-derived hepatocytes (hESC-HEP), and primary hepatocytes (HEP). The second group consists of hiPSC-derived endothelial cells (hiPSC-EC), hESC-derived endothelial cells (hESC-EC), and HUVEC. The third group consists of hiPSC-derived neural crest cells (hiPSC-NCC), hESC-derived neural crest cells (hESC-NCC), and primary neural crest cells (NCC). Lastly, we included a cancer set consisting of gene expression data from 5 cancer cell lines isolated from 5 types of cancer. Supplementary Table S1 summarizes the details of the cancer cell lines.
To determine the oncogenic signature within the hiPSC- and hESC-derivatives, we first conducted a global gene expression analysis of the hiPSC-derived cells (hiPSC-HEP, hiPSC-EC, and hiPSC-NCC), hESC-derived cells (hESC-HEP, hESC-EC, and hESC-NCC), corresponding primary cells (HEP, HUVEC, and NCC), and the set of common cancer cell lines. The significant probe sets that remained after ANOVA (P < 0.05 and fold-change ≥2.0) showed a maximum overlap of gene expression pattern (based on the percentage of genes with similar expression pattern between 2 groups of cells) between hiPSC-derivatives and cancer cell lines relative to those of hESC-derivatives and the primary cell lines (Fig. 1). The gene expression overlap percentages between cancer cell lines versus hiPSC-HEP, hiPSC-EC, and hiPSC-NCC are 69.4, 68.3, and 65.8, respectively. The gene expression overlap percentages between cancer cell lines versus hESC-HEP, hESC-EC, and hESC-NCC are 64.8, 65.7, and 60.0, respectively. For the corresponding primary cells, the percentage overlap is lowest at 62.6, 58.0, and 50.1 for HEP, EC, and NCC, respectively. A Euclidean distance measure and cluster analysis conducted on this global gene expression data confirmed the same pattern. Supplementary Figures S1, S2, and S3 show that hiPSC-derived cells are closest to the cancer cell lines in our analysis. The primary cells, regardless of lineage, lie at the furthest distance from cancer cell lines in our analysis.
Global gene expression pattern showing gene expression overlap (in percentage) of the following groups: (A) cancer versus hiPSC-HEP, hESC-HEP, and HEP; (B) cancer versus hiPSC-EC, hESC-EC, HUVEC; and (C) cancer versus hiPSC-NCC, hESC-NCC, and NCC. Gene expression overlap is highest between cancer cell lines and hiPSC derivatives compared with primary cells.
Cancer-specific gene expression pattern in hESC and hiPSC derivatives
To look deeper into the oncogenic potential of the hiPSC- and hESC-derivatives, we conducted a detailed functional annotation of the differentially expressed genes for each of the 3 groups. We focused on cancer-related genes obtained from the functional analysis using IPA, and analyzed the expression patterns of these genes within the 3 sets of data. On carrying out cluster analysis and distance measures, the distance matrix for the hepatocyte data showed that the distance between cancer and hiPSC-HEP is closest at 50.63, followed by hESC-HEP at 58.31, and farthest at 98.16 for primary hepatocytes (Fig. 2). For the endothelial data set, the corresponding distances are 39.12 (closest), 46.61, and 78.57 (farthest), respectively (Fig. 3). Similar observations are noted for the neural data set, the distances being 26.21 (closest), 29.50, and 52.75 (farthest; Fig. 4). Collectively, these results further confirmed the “oncogenic signature” that still remains within these derivatives. Figure 5 shows the relative distance measures of all the hiPSC- and hESC-derivatives from the cancer cells and with respect to the corresponding primary cell lines.
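As a worked example of the relative distance measure, the Python sketch below rescales the distances reported in this paragraph so that the farthest cell type in each lineage (the primary cells) has a value of 1. The dictionary layout and variable names are hypothetical, and the resulting values are a reading of the stated normalization rather than a reproduction of Figure 5.

```python
# Euclidean distances to the cancer set, as reported in the text.
reported = {
    "hepatocyte": {"hiPSC": 50.63, "hESC": 58.31, "primary": 98.16},
    "endothelial": {"hiPSC": 39.12, "hESC": 46.61, "primary": 78.57},
    "neural crest": {"hiPSC": 26.21, "hESC": 29.50, "primary": 52.75},
}

# Divide each distance by the largest distance within its lineage.
relative = {
    lineage: {
        cell: distance / max(distances.values())
        for cell, distance in distances.items()
    }
    for lineage, distances in reported.items()
}

for lineage, values in relative.items():
    summary = ", ".join(f"{cell}: {value:.2f}" for cell, value in values.items())
    print(f"{lineage} -> {summary}")
```

On this scale the hiPSC-derivatives sit at roughly half the distance of the primary cells from cancer in all 3 lineages, with the hESC-derivatives slightly farther away.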
Cancer-specific gene expression analysis for cancer, hESC- and hiPSC-derived hepatocytes, and primary hepatocytes. A, matrix showing the distance measures among the 4 cell types. B, hierarchical cluster analysis of the 4 cell types.
Cancer-specific gene expression analysis for cancer, hESC- and hiPSC-derived endothelial cells, and HUVEC. A, matrix showing the distance measures among the 4 cell types. B, hierarchical cluster analysis of the 4 cell types.
Cancer-specific gene expression analysis for cancer, hESC- and hiPSC-derived neural crest cells, and neural crest cells. A, matrix showing the distance measures among the 4 cell types. B, hierarchical cluster analysis of the 4 cell types.
Relative distance measures between (A) cancer cells versus hESC- and hiPSC-derived hepatocytes and primary hepatocyte cells; (B) cancer cells versus hESC- and hiPSC-derived endothelial cells and HUVEC; and (C) cancer cells versus hESC- and hiPSC-derived neural crest cells and neural crest cells.
Expression pattern of a common set of cancer genes
We next constructed a Venn diagram (Fig. 6) with the cancer genes that are significantly expressed in each of the 3 sets to define a common set of cancer genes that is significantly expressed in all 3 groups. Figure 6 shows that there are 20 potential cancer genes that are each significantly expressed in all 3 groups of data. The expression fold-change of the common set of 20 cancer genes (from the microarray data) in hESC-EC, hiPSC-EC, and cancer compared with HUVEC are provided in Supplementary Figure S4. On the basis of a literature review, as well as our analysis of the microarray data of these genes in hiPSC-ECs, hESC-ECs, and the cancer set, we selected 10 genes that appeared to be important cancer genes and that exhibited similar expression across these groups. Supplementary Table S2 shows the detailed functional annotation of this set of cancer genes.
Venn diagram showing the common cancer genes from the 3 data sets. Note there are 20 cancer genes that are common among the 3 data sets.
qRT-PCR and Western blot analysis
Because we routinely culture hiPSC- and hESC-derived endothelial cells in our laboratory, we chose to conduct qRT-PCR analysis of the selected cancer genes in hiPSC-EC, hESC-EC, and HUVEC. RNA was extracted from these cells at passage less than 5. By qRT-PCR, we noted similar expression levels of the selected genes in hiPSC-EC and hESC-EC that were distinct from those in HUVEC, suggesting that these derivatives do carry an oncogenic signature even after undergoing differentiation (Fig. 7A). The only variations we observed were in the expression of KIAA1199, which was upregulated in hiPSC-EC but not in hESC-EC (compared with HUVEC), and in the tumor suppressor THBS1 (46), which was downregulated in hESC-EC but not in hiPSC-EC. Note that FOS was not upregulated in hiPSC-EC, similar to the microarray results. To see if these patterns of gene expression in hESC- and hiPSC-derivatives changed during subsequent passages, we repeated the qRT-PCR on these cells at passage 5 and passage 11 (Supplementary Fig. S5) and observed no significant changes in gene expression compared with passage less than 5 (Fig. 7A).
A, qRT-PCR data analysis and validation of 10 selected common cancer genes in hiPSC-EC and hESC-EC relative to HUVEC at passage less than 5. B, Western blot analysis of TNC, SEMA5A, PCOLCE, and COL6A2 in hiPSC-EC and hESC-EC as compared with HUVEC confirms increased expression at protein level. C, qRT-PCR data of the selected common cancer genes in 4 cancer cell lines (treated with EC-medium, as well as untreated), hiPSC-EC, and hESC-EC relative to HUVEC. There was no significant change in the gene expression pattern of the cancer cell lines on treatment with EC-medium. D, qRT-PCR data of the selected common cancer genes in nonviral minicircle reprogrammed hiPSC-derived EC (mc-hiPSC–EC), lentiviral reprogrammed hiPSC-derived EC (hiPSC-EC), and hESC-EC relative to HUVEC.
We next conducted Western blots (Fig. 7B) to confirm the expression of 4 selected genes (TNC, SEMA5A, PCOLCE, and COL6A2) at the protein level. We observed significant upregulation of protein for these genes in hESC- and hiPSC-derivatives as compared with HUVEC. To evaluate if there was any effect of the media or differentiation factors on the oncogenic gene expression of hESC-EC and hiPSC-EC, we treated cancer cell lines (MDAMB-231, MDAMB-435, MDAMB-468, and PC-3) with the same media as used for hESC-EC and repeated the qRT-PCR analysis on the same set of cancer genes. We found no significant change in the cancer gene expression profile in cancer cells due to the presence of EC medium (Fig. 7C). Collectively, these results suggest that the expression of an oncogenic gene signature in hESC- and hiPSC-derivatives is not due to passage numbers or culturing conditions.
All of the hiPSC lines used in this study were derived using lentiviral transduction. To determine whether nonvirally reprogrammed hiPSC-derivatives also express an oncogenic signature, we conducted cancer gene qRT-PCR in endothelial cells that were differentiated from nonviral minicircle derived hiPSCs (mc-hiPSC–EC), and compared this expression pattern with hiPSC-EC, hESC-EC, and HUVEC. We found that the oncogenic gene expression pattern was more pronounced in endothelial cells derived from viral reprogrammed hiPSCs compared with mc-hiPSCs. However, significant expression of cancer genes remained in these mc-hiPSC–ECs compared with HUVEC (Fig. 7D). Hence, the nonviral reprogrammed mc-hiPSC–derived cells still express an oncogenic signature.
Discussion
Given the current state of stem cell medicine, the path to safe and effective hiPSC- and hESC-based regenerative therapies will be both challenging and lengthy. Rigorous basic and translational studies of the tumorigenic nature of pluripotent stem cells and their derivatives will be required before the safety of regenerative therapies can be ensured. Encouragingly, the results of hESC-based regenerative treatments in animal models by Geron and Advanced Cell Technology have led to Food and Drug Administration clearance for some of the first human clinical trials for acute spinal cord injury and Stargardt's macular dystrophy, respectively. Although the final results of these trials are not yet known, it is hoped that they will be both effective and safe, and that the transplanted cells will not lead to teratomas in the recipients.
In this work, we investigated whether pluripotent stem cell derivatives are indeed fully devoid of oncogenic potential. Our detailed bioinformatic analysis revealed that hiPSC- and hESC-derivatives still express an oncogenic signature that might present problems for future therapeutic usage. A series of qRT-PCR analyses further confirmed the upregulation of oncogenes, and downregulation of tumor suppressors, compared with the corresponding primary cells. Hence, we believe that future clinical trials should ideally use additional selection criteria (e.g., flow cytometry sorting) that will remove contaminating pluripotent cells from the therapeutic cell population prior to transplantation. The cell sorting can be either positive (e.g., sorting for the differentiated cell markers) or negative (e.g., sorting against embryonic cell markers). However, it is important to realize that differentiation is a dynamic process and not simply an “on/off” switch, and so there may be residual pluripotent cells within differentiated cultures even after flow sorting or other selection method is applied (47). Clearly, further studies must be conducted to assess the optimal method(s) for purifying these therapeutic cell populations.
The “optimal” reprogramming method for deriving hiPSCs from primary cells is still unknown in terms of reprogramming efficiency and minimizing the use of oncogenic reprogramming factors such as c-Myc. Recent work by Warren and colleagues described a simple, nonintegrating strategy for reprogramming via administration of synthetic mRNA modified to overcome innate antiviral responses, thus leading to integration-free iPSCs (16). This approach can reprogram multiple human cell types with a reported efficiency of more than 2%, which is 2 orders of magnitude higher than those typically reported for virus-based derivations. Furthermore, a recent report shows that expression of the miR302/367 cluster rapidly and efficiently reprograms mouse and human somatic cells to an iPSC state without exogenous transcription factors (48).
Other methods of producing nonviral hiPSCs have also been reported (13, 45, 49). Of note, our study focused primarily on 3 hiPSC lines that were reprogrammed using lentivirus and known oncogenic reprogramming factors. To address the issue of viral versus nonviral reprogramming, we also investigated the oncogenic property of cells derived from nonvirally reprogrammed hiPSCs using minicircle vectors. Interestingly, it seems that derivatives of integration-free mc-hiPSCs express an oncogenic signature that is similar to that of derivatives of lentiviral-derived hiPSCs, albeit slightly diminished.
Another study by Nakagawa and colleagues reported the use of another Myc family member, L-Myc, and c-Myc mutants (W136E and dN2), all of which have little transformation activity and thus may promote hiPSC generation without the proto-oncogene c-Myc. Their analysis showed that the selection of nononcogenic reprogramming factors reduced the tumorigenic potential of the resulting hiPSCs (50). Another more recent study by Anokye–Danso and colleagues showed that microRNA-302 and microRNA-367 can be used to efficiently mediate reprogramming of mouse and human somatic cells, without the use of the standard Oct4/Sox2/Klf4/Myc transcription factors (48). Clearly, these reprogramming advances, in conjunction with ongoing efforts to reduce residual oncogenic gene expression in stem cell derivatives, will need to be further investigated in the coming years.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant Support
This work was supported by NIH New Innovator Award DP2OD004437, NIH AG036142, and NIH AI085575 (J.C. Wu).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.