Abstract
Elucidating how exposure to environmental chemicals can lead to cancer is difficult because of the complex nature of these compounds and the challenge of establishing biologically relevant experimental models to study them. Environmental chemicals often act through selective mechanisms in different cell types and can modulate both targeted cells and their microenvironment, including immune cells. Traditional epidemiologic correlation analyses, in vitro cell-based assays, and animal models are currently limited in that they cannot comprehensively examine cellular heterogeneity and tissue-selective influences. To this end, we propose utilizing single-cell RNA-sequencing (scRNA-seq) to more effectively capture the subtle and complex effects of environmental chemicals and how exposure to them could lead to cancer. Because scRNA-seq measures gene expression at a significantly higher resolution than bulk RNA-sequencing (RNA-seq), it enables studies to evaluate how environmental chemicals regulate gene transcription in different cell types as well as how these compounds impact signaling pathways and interactions between cells in the tissue microenvironment. These studies will be valuable for evaluating environmental chemicals' carcinogenic properties at the individual cell level.
See all articles in this CEBP Focus section, “Environmental Carcinogenesis: Pathways to Prevention.”
Introduction
In the field of environmental carcinogenesis, epidemiologic studies are powerful ways to correlate chemical exposures with the development of cancer in humans. However, epidemiologic approaches alone are insufficient to elucidate how certain chemical exposures drive cancer development, leading researchers to rely on cell-based reporter assays and animal models to assess general mechanistic changes caused by exposure to these chemicals. Even so, cell-based assays and animal models have their own limitations: cell-based assays are typically developed with known targets in mind, and animal models provide useful information only when definitive phenotypic changes are observed. Consequently, these experimental models for environmental risk assessments may not be able to provide adequate support to enact policy changes to regulate human exposure (1). Thus, an active collaboration among scientists involved in epidemiologic, in vitro, and in vivo studies is critical to advance our knowledge in the field of environmental carcinogenesis. An excellent example of one such collaboration is the NIH-funded Breast Cancer and the Environment Research Program (2).
So far, most experimental conditions associated with the mechanistic studies of environmental carcinogenesis do not factor in elements such as the microenvironment and cellular heterogeneity, both of which play a role in carcinogenesis. Here, we propose using single-cell transcriptomics to address these issues. Single-cell RNA-sequencing (scRNA-seq) provides high resolution gene expression information on the individual cellular level, allowing highly heterogeneous cell populations or complex cellular networks (e.g., mammary gland and brain tissue) to be studied. In contrast with bulk RNA-sequencing (RNA-seq), which provides gene expression information albeit as a mixture, scRNA-seq preserves cellular heterogeneity, which can further elucidate the oftentimes subtle effects of environmental chemicals (Fig. 1). We hope this minireview establishes the value of scRNA-seq as an important tool to comprehensively study the effects of environmental chemicals on cancer risk.
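The contrast between bulk and single-cell measurements can be illustrated with a minimal sketch. All expression values and cell-type labels below are invented for illustration and are not data from any study cited here. The sketch assumes a gene that is induced in one cell type and repressed in another after exposure, so that the changes cancel out in a bulk-style average.

```python
from statistics import mean

# Hypothetical normalized expression of one gene in six cells per sample.
# All values are invented for illustration only.
control = {"fibroblast": [2.0, 2.0, 2.0], "macrophage": [2.0, 2.0, 2.0]}
exposed = {"fibroblast": [4.0, 4.0, 4.0], "macrophage": [0.0, 0.0, 0.0]}


def bulk_average(sample):
    """Average over all cells at once, mimicking a bulk measurement."""
    return mean(value for cells in sample.values() for value in cells)


def per_type_average(sample):
    """Average within each annotated cell type, as single-cell data allow."""
    return {cell_type: mean(values) for cell_type, values in sample.items()}


# The bulk-style averages are identical for control and exposed samples.
print(bulk_average(control), bulk_average(exposed))
# The per-cell-type averages expose the opposing changes.
print(per_type_average(control))
print(per_type_average(exposed))
```

In this toy case the bulk-style average is unchanged by exposure, while the per-cell-type view shows induction in one population and loss of expression in the other.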
Overview of Single-Cell RNA-seq
The origin of single-cell transcriptomics is attributed to Eberwine and colleagues in 1992, who first examined single cells isolated from the rat hippocampus (3, 4). Since then, the development of platforms such as 10X Genomics and Fluidigm has popularized single-cell transcriptomics in a wide variety of fields. SMART-seq, on the Fluidigm C1 system, allows for full-length cDNA coverage, but has a significant limitation on the number of cells that can be sequenced. On the other hand, 10X Genomics' Chromium is capable of sampling many cells, but only allows for 5′/3′ counting, and generally detects fewer genes per cell (5). After sequencing, downstream processing of scRNA-seq data is often carried out using R packages such as Seurat (6, 7), or with software such as Loupe from 10X Genomics and SeqGeq from FlowJo.
Beyond differential gene expression analyses and cell clustering, additional tools have been developed to analyze scRNA-seq datasets further. Due in part to the high-resolution capabilities of single-cell sequencing, there is growing interest in studying cells that are in between two defined “states,” such as epithelial cells during the mammary gland developmental process (8), or leukocytes during activation (9). Analytic tools such as Monocle (10) and SCENT (11) were developed to investigate these cellular “trajectories” (12). The capability of scRNA-seq to capture high-resolution transcriptional data also allows for profiling of cell populations by utilizing differentially expressed genes as signatures for gene ontology scoring. This method opens the possibility of identifying subpopulations based on function rather than on comparison with gene lists in the literature (13, 14). Another emerging method for analyzing single-cell data is mapping ligand–receptor relationships at the cellular level. An early large-scale map was first proposed by Ramilowski and colleagues in 2016 (15), with its implementation into single-cell datasets developed as SoptSC by Wang and colleagues (16). Single-cell transcriptomics is a rapidly expanding field with new applications as users find innovative methods to analyze single-cell datasets. However, single-cell transcriptomic analysis is currently not fully utilized in the field of environmental carcinogenesis.
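The idea behind ligand–receptor mapping can be sketched with a deliberately simplified score. This is not the algorithm used by SoptSC or by Ramilowski and colleagues. It assumes that an interaction between a sender and a receiver cluster can be approximated by the product of the mean ligand expression in the sender and the mean receptor expression in the receiver. The expression values are invented, and the Areg–Egfr pair is borrowed from the PBDE study discussed in this review purely as an example.

```python
from statistics import mean

# Hypothetical per-cell expression values grouped by cluster (invented).
expression = {
    "epithelial": {"Areg": [3.0, 5.0], "Egfr": [0.0, 0.0]},
    "fibroblast": {"Areg": [0.0, 0.0], "Egfr": [2.0, 4.0]},
}
# Ligand-receptor pairs to test, written as (ligand, receptor).
ligand_receptor_pairs = [("Areg", "Egfr")]


def interaction_scores(expression, pairs):
    """Score every sender -> receiver cluster combination for each pair.

    Assumed score: mean ligand expression in the sender cluster multiplied
    by mean receptor expression in the receiver cluster.
    """
    scores = {}
    for ligand, receptor in pairs:
        for sender, sender_genes in expression.items():
            for receiver, receiver_genes in expression.items():
                score = mean(sender_genes[ligand]) * mean(receiver_genes[receptor])
                scores[(sender, receiver, ligand, receptor)] = score
    return scores


scores = interaction_scores(expression, ligand_receptor_pairs)
best = max(scores, key=scores.get)
print(best, scores[best])
```

With these toy values, the highest-scoring interaction is the epithelial-to-fibroblast Areg–Egfr signal. Published methods typically add statistical testing and curated ligand–receptor databases on top of this basic idea.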
Challenges Facing Environmental Chemical Carcinogenesis Studies
Evaluating how exposure to certain environmental chemicals can lead to carcinogenesis remains a challenge today. These obstacles are present due to the fact that these chemicals, by nature, do not act in a singular, specific manner, but rather have multiple biological targets, leading to different effects (17). Furthermore, conventional in vivo and in vitro approaches require predefined mechanisms or pathways to study, and epidemiologic studies are not mechanism-driven studies.
Current approaches to elucidate the complex nature of environmental chemicals have significant limitations. In vitro cell line work, by nature, restricts experiments to one or a few cell types at a time, and does not account for heterogeneity within the same cell type. Furthermore, many environmental chemicals do not induce readily observable phenotypic changes early during the exposure period, making in vivo animal models difficult to utilize without dramatically observable endpoints. To this end, we believe that scRNA-seq can fill the knowledge gap in environmental chemical studies by providing the capability to study subtle changes associated with the tissue microenvironment and cellular heterogeneity, as well as potential cell-to-cell cross-talk mechanisms, at an early point of exposure.
Evaluation of Tissue Microenvironment and Cellular Heterogeneity with scRNA-seq
The microenvironment plays a crucial role in maintaining homeostasis (18); alterations to the microenvironment from environmental chemical exposures could potentially allow for unchecked cellular growth, and ultimately, cancer. These mechanisms are often facilitated through a complex network of signaling among fibroblasts, immune cells, and stem cells. Matrix metalloproteinases, for instance, are a class of enzymes highly involved in extracellular matrix (ECM) remodeling; the dysregulation or overexpression of these enzymes can lead to excessive ECM degradation, priming the microenvironment for tumorigenesis (19). Furthermore, chemokines found in the microenvironment have also been shown to play a role in regulating stem-like properties in cancer cells and cancer invasiveness, alongside being regulators for local immune cell recruitment (20).
Recent emphasis on the tissue microenvironment as a key player in carcinogenesis has led to the use of scRNA-seq to study subtle transcriptional changes in the tumor microenvironment for a variety of cancers in humans, such as breast cancer. Chung and colleagues examined the cellular profiles of breast carcinoma samples from 11 patients across the four breast cancer subtypes: luminal A, luminal B, HER2, and triple-negative breast cancer (TNBC; ref. 21). Across these samples, 175 cells were identified as immune cells, which were then clustered into three groups: B cells, T cells, and macrophages. Through scRNA-seq, T cells from the luminal B subtype showed an early, naïve expression signature, while T cells from a TNBC sample showed an expression signature indicative of exhaustion. T cells with an exhaustion expression signature are associated with a loss in effector functions and an increase in inhibitory receptors such as PD-1 (22), suggesting that T-cell infiltration of different breast cancer subtypes may have different responses. These subtle expression-level changes would not have been elucidated through traditional bulk RNA-seq methods. In 2017, Puram and colleagues examined roughly 6,000 single cells from patients with head and neck squamous cell carcinoma, and were able to classify cells into groups such as macrophages, endothelial cells, and fibroblasts (18). More importantly, they discovered that in their sequenced cells, there was still cross-talk between malignant and nonmalignant cells, indicating the importance of the microenvironment in carcinogenesis (18). Recently, Azizi and colleagues conducted a large-scale immune cell profiling from samples taken from breast carcinomas and normal breast tissue (9). After sampling 45,000 immune cells, they found an expansion of immune cell diversity in the breast carcinoma samples compared with normal breast tissue. 
These results suggest that the pathways to carcinogenesis involve a remodeling of the immune cell populations already present. Capturing the phenotypic changes in infiltrating immune cell populations during carcinogenesis may provide a mechanistic understanding of how these cancers emerge. The resolution at which scRNA-seq is able to capture perturbations to the immune cell profiles among individuals makes it a valuable tool to the growing field of immunoepidemiology, potentially supplementing other methods for quantifying immune-cell populations, such as through analyzing differentially methylated DNA regions (23). There have also been efforts to use scRNA-seq to generate high-resolution reference immune-cell profiles to estimate leukocyte populations from bulk RNA-seq data (24). This approach opens opportunities to explore how environmental chemicals' perturbations can affect immune cell profiles over a wide population.
Great strides have already been made to capture cellular heterogeneity using scRNA-seq technology as well. Some have postulated that subtle cellular heterogeneity is the source of tumor resistance to cancer therapies, further highlighting the need for a high-resolution methodology to study these differences (25). Even cells of the same classification, such as epithelial cells or fibroblasts, can show cell-to-cell variation based on subtle differences at the single-cell transcriptomic level. scRNA-seq conducted on mesenchymal cells from mice harboring the MMTV-PyMT tumor showed heterogeneous expression of common cancer-associated fibroblast markers such as Fap, Pdgfra, and Vim (13). Additional enrichment analysis using gene ontology terms further classified the fibroblasts into vascular, matrix-related, cycling, or developmental fibroblasts. For instance, developmental fibroblasts, in contrast with the other three classifications, showed a higher expression of stem cell–related genes such as Scrg1, Sox9, and Sox10, leading Bartoschek and colleagues to conclude that these were potentially cells originating from tumor cells that have undergone an epithelial-to-mesenchymal transition (13). Others have argued for abandoning strict classification of cells altogether, suggesting that cells often follow a differentiation trajectory and therefore represent a spectrum of gene signatures that cannot be compartmentalized into subgroups, a view made possible by the ability to study cellular heterogeneity at a high resolution. For instance, Bach and colleagues isolated epithelial cells from the mammary gland of mice at four different developmental stages, and reconstructed their data using pseudotime to trace the trajectory of differentiation of their progenitor luminal cells to more differentiated states (8).
scRNA-seq Applications in Environmental Chemical Exposure Studies
Most of the current literature on environmental chemicals using scRNA-seq technology consists of exploratory studies on general mechanisms of action, such as a chemical's ability to induce inflammation in the lung or its capability to affect fetal development (26, 27). More specifically, Bhetraratana and colleagues examined the ability of diesel exhaust particles to induce heterogeneous macrophage responses in the lungs of mice. After conducting pathway analysis of their data, they discovered that the innate immune system pathways were primarily dysregulated in alveolar macrophages, and saw the presence of oxidative stress in peritoneal macrophages. They proposed that these impacted pathways are signs that diesel exhaust particles can induce the release of inflammatory mediators, eventually inducing inflammation in the lungs (27). Cross-talk from the tissue microenvironment was the central focus of a study on human embryonic stem cells, in which the effects of nicotine exposure on overall cell-to-cell communication were examined. Adapting a previously established ligand–receptor framework to analyze their single-cell data (15), Guo and colleagues showed that nicotine caused an overall increase in intercellular signaling among human embryonic cells, and identified the activation of the HMGB1-TLR4 pathway as a key upregulated pathway induced by nicotine exposure (26). This pathway, in turn, has been associated with an increased incidence of tumor metastases (28).
In 2000, polybrominated diphenyl ethers (PBDE), a class of widely prevalent fire retardants, were found to accumulate in human tissues and were suggested to be an emerging environmental challenge (29). Exposure to PBDEs and their modes of action have been extensively studied during the past 20 years. Our laboratory has confirmed the estrogen receptor- and hydrocarbon receptor–mediated mechanisms of three PBDEs (BDE47, BDE100, and BDE153) in human breast cancer cells and patient-derived xenografts (14). With significant literature detailing the importance of the tissue microenvironment in carcinogenesis, as well as the importance of recognizing cellular heterogeneity of mammary glands, we examined how these PBDEs could remodel the infiltrating immune-cell populations in the mammary gland, as well as which specific cell types were impacted by PBDE exposure. In brief, we sequenced 14,856 cells from entire mouse mammary glands after the mice were treated with vehicle, estradiol (E2), or E2 and PBDEs in combination. The ability of scRNA-seq to capture high-resolution expression data led us to propose two mechanisms by which either E2 or E2 in combination with PBDEs could lead to M2 macrophage polarization in the mammary gland, which, in turn, has been reported to have protumoral functions. One such pathway is through E2-mediated upregulation of Ccl2 in Esr1+ fibroblasts, which can recruit M2 macrophages to the mammary gland. The other potential mechanism is through the PBDE-induced E2-mediated expression of Areg in Esr1+/Pgr+ cells in the mammary gland epithelium, which in turn interacts with Egfr+ fibroblasts to induce M2 macrophage polarization in the tissue microenvironment (Fig. 2; ref. 30). The microenvironment, consisting of surrounding fibroblasts, immune cells, and other neighboring epithelial cells, plays a role in the development of epithelial organs such as the mammary gland (31).
Our recent study serves as a good example of how scRNA-seq analysis can provide comprehensive information on how environmental chemicals such as PBDEs directly impact and potentially reshape these components of the mammary gland microenvironment, and can lead to a better understanding of how these chemicals can induce carcinogenesis following surgical menopause.
For our own PBDE study, we also observed heterogeneity in our luminal epithelial and fibroblast populations from sequenced mammary gland cells. Esr1 and Pgr expression induced by E2 and PBDE treatment were not always simultaneously present in the same cell, suggesting that the induction of Pgr could occur independently of Esr1 status in the epithelial cell (30). Furthermore, there was a subpopulation of fibroblasts expressing Esr1 and Ccl2, reflecting the heterogeneity within our detected fibroblast populations (30). Resolving this cellular heterogeneity at the single-cell level is crucial as not all epithelial cells are involved in the E2–AREG–EGFR–M2 pathway, just as how not all fibroblasts facilitate the E2–CCL2–M2 macrophage pathway. These subpopulations would not have been revealed using traditional bulk RNA-seq methods.
Current Limitations of Single-Cell Transcriptomics
Despite the significant advances and discoveries already made using scRNA-seq technology, several major limitations still exist for this experimental platform. As Eberwine and colleagues highlight in their 2014 commentary on the technology, 22 years after their initial work on neurons, major issues include the lack of a standardized single-cell isolation method and the loss of cellular spatial information in scRNA-seq (32). The need to optimize cell isolation protocols for single-cell transcriptomics stems primarily from the intrinsic variation in tissue composition. Isolation of myeloma cells, for instance (33), differs significantly from epithelial tissue digestion (30, 34), which involves harsher digestion enzymes such as collagenase and extended periods of mechanical digestion. The lack of standardized methods for single-cell isolation can lead to variability between replicates, as well as altered transcriptional states (32), making it difficult to compare scRNA-seq results from different experiments. Subtle differences between cell subpopulations may be masked, or noise resulting from technical variation may be misinterpreted as biological signal, producing false positives. At the other end of the spectrum, dropout is a common occurrence in scRNA-seq, where sequencing sensitivity limitations occasionally cause certain genes to have no transcript reads. To account for these issues, some have resorted to computational denoising of large-scale scRNA-seq data (35), while others have utilized network-based methods for identifying clusters or cell subpopulations instead of relying on only a few established genetic markers (36).
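As a minimal sketch of how the extent of zero counts can be inspected before any denoising, the fraction of cells with no reads can be tabulated per gene. The count matrix below is invented, and the function name is hypothetical. A zero count by itself cannot distinguish technical dropout from true absence of expression, so this fraction is only an upper bound on dropout.

```python
# Hypothetical raw counts: one row per cell, columns ordered as in `genes`.
# All values are invented for illustration only.
genes = ["Esr1", "Pgr", "Vim"]
counts = [
    [0, 0, 12],
    [3, 0, 9],
    [0, 1, 15],
    [0, 0, 7],
]


def zero_fraction(counts, genes):
    """Return, for each gene, the fraction of cells with zero reads."""
    n_cells = len(counts)
    return {
        gene: sum(1 for cell in counts if cell[index] == 0) / n_cells
        for index, gene in enumerate(genes)
    }


fractions = zero_fraction(counts, genes)
print(fractions)
```

Genes with a high zero fraction, such as the lowly expressed receptors in this toy matrix, are the ones for which denoising or network-based approaches are most likely to change the interpretation.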
In addition to technical noise and dropouts resulting from the cell isolation process, another significant source of error contaminating scRNA-seq results stems from the presence of “doublets” in the sample preparation. These are cell clumps that arise from incomplete tissue digestion, resulting in multiple cells being captured together and improperly sequenced as one cell. Doublets may lead to incorrect identification of supposed “rare” clusters that show gene expression signatures from two distinct populations. While there may be true rare cell populations identified through scRNA-seq, it is imperative to validate the presence of these rare clusters through other methods to eliminate the possibility that doublets are causing these clusters to emerge. Computational methods such as the DoubletFinder package in R have been developed to screen for and remove these doublets in scRNA-seq data (37). An additional potential issue from the sample preparation process is dissociation-induced expression level changes to the sample itself. Van den Brink and colleagues discovered that when dissociating muscle stem cells from mice, cells that underwent tissue dissociation had higher expression levels of Fos and Socs3 when compared with cells that did not undergo any dissociation procedure (38).
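The way a doublet produces a mixed expression signature can be shown with a simple marker-based heuristic. This is not the method implemented in DoubletFinder. The sketch assumes two lineage markers (Epcam for epithelial cells and Ptprc for immune cells), an arbitrary count threshold, and invented counts, and it merely flags cells that express markers of more than one lineage.

```python
# Assumed lineage markers and invented per-cell counts, for illustration only.
lineage_markers = {"epithelial": "Epcam", "immune": "Ptprc"}
cells = {
    "cell_1": {"Epcam": 8, "Ptprc": 0},
    "cell_2": {"Epcam": 0, "Ptprc": 6},
    "cell_3": {"Epcam": 7, "Ptprc": 5},
}


def candidate_doublets(cells, markers, min_count=2):
    """Flag cells whose counts reach `min_count` for more than one lineage marker."""
    flagged = []
    for cell_id, counts in cells.items():
        lineages = [
            lineage
            for lineage, gene in markers.items()
            if counts.get(gene, 0) >= min_count
        ]
        if len(lineages) > 1:
            flagged.append(cell_id)
    return flagged


flagged = candidate_doublets(cells, lineage_markers)
print(flagged)
```

A flagged cell is only a candidate: a genuine rare population could also co-express both markers, which is why orthogonal validation remains necessary.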
An intrinsic limitation of scRNA-seq techniques is that spatial information of the cells being sequenced is lost during processing. In contrast with histologic methods such as IHC or immunofluorescence staining, which preserve the location of each cell being analyzed, cells subjected to scRNA-seq analysis can only be compared with each other on the basis of gene expression signatures. Therefore, classification of cells using scRNA-seq can only be done based on well-known markers at the transcript level. Zhu and colleagues have attempted to resolve this technical limitation by combining scRNA-seq with sequential FISH to reconstruct a spatial map of the cells' original location relative to each other (39). Stahl and colleagues addressed the same limitation through spatial transcriptomics, where tissue digestion is done directly on histologic slides and mRNA is captured while maintaining positional information (40).
scRNA-seq data will be useful only when generated from good single-cell isolation and high-quality RNA preparation. This is challenging for studies involving specimens that are not freshly prepared or that have been stored long term, typically those collected for epidemiologic studies. It is often difficult to obtain fresh tissue samples from humans that have good cell viability and dissociate well into single-cell suspensions. In circumstances where the human tissue is prone to transcriptomic alterations during dissociation, or where the cell viability falls too low during the digestion process, single-nuclei RNA-seq is a potential alternative to whole-cell scRNA-seq (41). However, information from cytoplasmic mRNA may be lost, which could be crucial for defining cell function (42). The field of single-cell analysis also includes single-cell nuclear DNA and DNA methylation analyses, which are useful for studying genomic and epigenetic differences at the individual cellular level, respectively (43, 44). Beyond the transcriptomic analyses offered by scRNA-seq, other single-cell technologies are emerging to investigate cellular characteristics such as copy number variations, clonal differences, or posttranslational modifications. In light of the limitations of scRNA-seq, it is clear that while scRNA-seq can provide valuable transcriptomic information at the cellular level, other experimental methods must be used to validate findings.
Discussion and Conclusions
Current literature on environmental carcinogenesis studies utilizing scRNA-seq is lacking, despite significant advances elsewhere in the fields of tumor and developmental biology. Of the few studies that have used scRNA-seq technology to examine environmental chemicals, our work on PBDEs was the only one to propose a mechanism of environmental carcinogenesis, with our finding that a potential E2–AREG–EGFR–M2 pathway may be responsible for PBDE-induced ductal regrowth in the mouse mammary gland. This unbiased, exploratory capability of scRNA-seq to elucidate novel mechanisms of action makes it an invaluable tool for environmental chemical carcinogenesis studies. Furthermore, scRNA-seq has also made it possible to elucidate subtle differences between subpopulations of cells at an early point of exposure to environmental chemicals.
While scRNA-seq can reveal previously unknown biological pathways affected by environmental chemicals or characterize subpopulations of cells, this method alone still has several technical limitations, caused by sample quality, data noise, dropout, and data contamination with doublets. Therefore, single-cell transcriptomics is best used together with existing in vitro, in vivo, and epidemiologic studies, which serve as validation experiments (Fig. 3). scRNA-seq, as noted in this review, is an excellent tool for hypothesis generation due to its unbiased, exploratory nature, but loses crucial information such as cellular spatial relationships in the process. Despite these limitations, the complex nature of environmental chemicals still makes scRNA-seq a valuable resource to utilize in this field. Considering the discoveries made already in the fields of tumor biology and developmental biology, as well as the progress made in elucidating mechanisms of action in several environmental chemical studies, there is significant potential in utilizing scRNA-seq technology to study environmental chemical carcinogenesis.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Acknowledgments
Financial support was provided in part to the corresponding author (S. Chen) by NIH award U01ES026137. S. Chen is also a member of the City of Hope Cancer Center grant (NIH P30CA033572).