Abstract
The extent to which early events shape tumor evolution is largely uncharacterized, even though a better understanding of these early events may help identify key vulnerabilities in advanced tumors. Here, using genetically defined mouse models of small cell lung cancer (SCLC), we uncovered distinct metastatic programs attributable to the cell type of origin. In one model, tumors gain metastatic ability through amplification of the transcription factor NFIB and a widespread increase in chromatin accessibility, whereas in the other model, tumors become metastatic in the absence of NFIB-driven chromatin alterations. Gene-expression and chromatin accessibility analyses identify distinct mechanisms as well as markers predictive of metastatic progression in both groups. Underlying the difference between the two programs was the cell type of origin of the tumors, with NFIB-independent metastases arising from mature neuroendocrine cells. Our findings underscore the importance of the identity of cell type of origin in influencing tumor evolution and metastatic mechanisms.
Significance: We show that SCLC can arise from different cell types of origin, which profoundly influences the eventual genetic and epigenetic changes that enable metastatic progression. Understanding intertumoral heterogeneity in SCLC, and across cancer types, may illuminate mechanisms of tumor progression and uncover how the cell type of origin affects tumor evolution. Cancer Discov; 8(10); 1316–31. ©2018 AACR.
See related commentary by Pozo et al., p. 1216.
This article is highlighted in the In This Issue feature, p. 1195.
Introduction
Most patients with cancer die from metastatic disease; however, many key aspects of metastatic progression remain poorly understood. In particular, the nature of the changes that drive metastasis and the potential impact of early events in tumorigenesis upon the direction of this evolution are largely unexplored. A better understanding of these mechanisms could help diagnose and treat patients more effectively (1, 2).
Small cell lung cancer (SCLC) is one of the most metastatic and lethal of all major cancer types (3, 4). SCLC is thought to acquire metastatic ability early in the course of tumor progression, but emerging evidence suggests that this metastatic ability is not inherent. Rather, genetic events such as upregulation of CXCR4 and NEUROD1 and/or amplification of NFIB are critical for SCLC invasion and metastasis in at least a subset of patients (5–9). Recent data from genetically engineered mouse models and human tumors have uncovered multiple levels of heterogeneity in SCLC; however, how this heterogeneity pertains to metastatic progression is largely unexplored (10–19).
SCLC is a neuroendocrine cancer; thus, it made sense when deletion of the key SCLC tumor suppressor genes Rb1 and Trp53 in pulmonary neuroendocrine cells uncovered these cells as one cell type of origin for SCLC in mouse models (refs. 20–22; reviewed in ref. 4). Interestingly, although pulmonary neuroendocrine cells are quite rare, induction of these same genetic alterations in the much more prevalent lung epithelial cell types expressing either Scgb1a1 (coding for CC10, a marker of club cells) or Sftpc (coding for SPC, a marker of alveolar type II cells) demonstrated that these cells have very little, if any, ability to serve as the cell of origin of SCLC (20, 21). Nonetheless, the lung epithelium contains many diverse cell types (23–25), and whether SCLC can be initiated from other cell types is unknown.
Here, through detailed molecular characterization of primary tumors and metastases, we identify two discrete paths by which SCLC gains metastatic ability. Our data indicate that the same genomic alterations in different cell types give rise to distinct subtypes of SCLC, and that the founding cell type of origin can define the trajectory of tumor progression.
Results
Mouse SCLC Initiated from Adult Neuroendocrine Cells Gains Metastatic Ability without Upregulation of NFIB
To study the mechanisms underlying SCLC metastasis, we initially used the well-characterized Rb1flox/flox;Trp53flox/flox;p130flox/flox;R26mTmG (TKO;mTmG) mouse model, in which we initiated tumors via intratracheal delivery of an adenoviral vector expressing Cre recombinase under the control of the broadly expressed CMV promoter (Ad-CMV-Cre; refs. 7, 26). Thus, in this model, Trp53, Rb1, and p130 are inactivated in many different cell types in the lung. We previously found that in this “CMV TKO” model, amplification of the Nfib genomic locus and high expression of the NFIB transcription factor in primary tumors is an important step during metastatic progression (7). To further investigate SCLC metastatic progression in a model in which the tumors arise from a defined cell type, we subsequently initiated tumors in TKO;mTmG mice with Ad-CGRP-Cre, which specifically directs Cre expression to mature, CGRP-expressing neuroendocrine cells (refs. 21, 22, 27; Fig. 1A and B). These “CGRP TKO” mice developed many fewer SCLCs than CMV TKO mice even when transduced with a 10- to 20-fold higher titer of Ad-CGRP-Cre (Supplementary Fig. S1A). Nonetheless, both CMV TKO and CGRP TKO mice developed SCLC and widespread metastatic disease with metastasis to multiple organs, including the lymph nodes and liver, 6 to 9 months after tumor initiation (Fig. 1A and D; Supplementary Fig. S1B–S1H). Previous studies showed that Ad-CMV-Cre and Ad-CGRP-Cre each can also initiate metastatic SCLC in Rb1flox/flox;Trp53flox/flox mice, but again, 15-fold more Ad-CGRP-Cre was used to initiate tumors (21, 28).
As part of our characterization of primary tumors and metastases in CMV TKO and CGRP TKO mice, we performed immunostaining for the prometastatic transcription factor NFIB. Surprisingly, in contrast to CMV TKO metastases, NFIB was undetectable in the vast majority of CGRP TKO metastases (Fig. 1C–E; Supplementary Fig. S1C–S1H). In general, primary tumors in CMV TKO mice were characterized by a “rosette” growth pattern with more differentiated glandular structures intermixed with less well-differentiated areas, whereas primary tumors in CGRP TKO mice were predominantly characterized by a “solid-nested” growth pattern. Notably, metastases in both models uniformly exhibited solid-nested growth and no rosette formation. Primary tumors with rosette histology were uniformly NFIBnegative/low in both models (Supplementary Table S1). Most of the solid-nested areas in CMV TKO mice were NFIBhigh, whereas the solid-nested areas in CGRP TKO mice were mostly NFIBnegative/low (Supplementary Fig. S1G–S1H). The majority of the macrometastases in the lymph nodes and liver, as well as most of the cancer cells growing within the pulmonary lymphatics, were NFIBhigh in CMV TKO mice. In contrast, most metastases and lymphatic invasive cells in CGRP TKO mice were NFIBnegative/low (Fig. 1C–E; Supplementary Fig. S1E–S1F). Similarly, across multiple other genotypes (including Rb1flox/flox;Trp53flox/flox, Rb1flox/flox;Trp53flox/flox;Ptenflox/+, and Rb1flox/flox;Trp53flox/flox;Ptenflox/flox mice), SCLC tumors initiated by Ad-CMV-Cre generally gave rise to NFIBhigh metastases, whereas SCLC tumors initiated by Ad-CGRP-Cre generally gave rise to NFIBnegative/low metastases (refs. 5, 29; Supplementary Fig. S2A). Importantly, the expression of NFIB is also heterogeneous in human SCLC lymph node and brain metastases, suggesting that diverse mechanisms of tumor progression also exist in human SCLC (Fig. 1F and G; Supplementary Fig. S2D–S2E; refs. 6, 7, 30).
Collectively, these data indicate that SCLC initiated from CGRPpositive cells in the mouse lung can metastasize in the absence of NFIB upregulation, and this may recapitulate tumor progression in a subset of patients with SCLC.
Murine SCLC Requires Progression Prior to Dissemination
In the CMV TKO model, genomic amplification of Nfib precedes dissemination, consistent with tumor evolution as a prerequisite for metastasis (5–7). The absence of NFIB upregulation in most tumors in CGRP TKO mice prompted us to investigate whether SCLC in this model also progresses to gain metastatic ability or if these tumors were inherently metastatic. Even at a late time point (7–11 months after tumor initiation), when they harbored multiple large primary tumors, not every CMV TKO or CGRP TKO mouse had detectable macrometastases, micrometastases, or even disseminated tumor cells (DTC; Fig. 2A). These initial data suggested that not all CMV-Cre or CGRP-Cre TKO tumors have the ability to disseminate and metastasize.
To further investigate whether only some tumors possess metastatic ability, we incorporated a multicolor reporter allele (R26LSL-Motley; ref. 31) into the Rb1flox/flox;Trp53flox/flox;p130flox/flox TKO model. In these TKO;Motley mice, clonally derived tumors are labeled with different combinations of fluorescent markers (Supplementary Fig. S3A). Importantly, in individual TKO;Motley mice with Ad-CMV-Cre– or Ad-CGRP-Cre–initiated SCLC, the metastases were almost always of one color, suggesting that they originated from a single primary tumor. Not only were metastases clonal in origin, but DTCs in the pleural cavity were also almost always of one color (Fig. 2B–D; Supplementary Fig. S3B). Thus, our data are consistent with the idea that not all tumors in CGRP TKO mice contain cells with the ability to readily disseminate and metastasize.
Although NFIB levels are low in the CGRP TKO model, we considered whether NFIB might be transiently upregulated during metastatic dissemination, then downregulated to generate NFIBnegative/low metastases. To investigate this possibility, we examined NFIB expression by immunofluorescence staining of FACS-isolated SCLC cells. DTCs and cancer cells from metastases from CGRP TKO mice had low NFIB expression, whereas DTCs and metastatic cancer cells in the CMV TKO model nearly always had high NFIB expression (Fig. 2E and F). This argues against transient expression of NFIB during metastasis in the CGRP TKO model. Collectively, these data indicate that metastatic ability is an acquired property of cancer cells in both models of SCLC, and that SCLC can take different paths during metastatic progression (Supplementary Fig. S3C).
Absence of Widespread Chromatin Changes during Metastatic Progression of SCLC in CGRP TKO Mice
During metastatic progression of CMV TKO tumors, upregulation of NFIB drives a dramatic global increase in chromatin accessibility (7). Metastasis in CGRP-Cre mice could be driven either by widespread chromatin accessibility changes that are independent of NFIB upregulation or by mechanisms that are independent of changes in chromatin accessibility. To characterize the changes in chromatin accessibility during metastasis, we performed Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) on primary tumors and metastases from CGRP TKO mice (Supplementary Figs. S4–S5; Supplementary Table S2 and Methods; ref. 32). In contrast to our previous analysis of CMV TKO tumors and metastases (7), unsupervised hierarchical clustering based on chromatin accessibility did not clearly separate the primary tumors and liver metastases from CGRP TKO mice (Fig. 3A). Comparison of the accessible chromatin regions in CGRP-Cre primary tumors and liver metastases uncovered only 1,020 genomic regions that were less accessible in metastases and 130 genomic regions that were more accessible in metastases (Fig. 3B; Supplementary Fig. S6A–S6C). Hierarchical clustering of chromatin accessibility of samples from both the CMV TKO and CGRP TKO models showed that CGRP TKO primary tumors, CGRP TKO liver metastases, and NFIBlow CMV TKO primary tumors clustered together but were separate from the NFIBhigh CMV TKO samples (Fig. 3C and D). Direct comparison between CGRP TKO and NFIBlow CMV TKO primary tumors uncovered very few regions with differential chromatin accessibility (260 regions more open in CMV TKO samples and 160 regions more open in CGRP TKO samples; Fig. 3E). Collectively, these data indicate that the chromatin state alterations during metastatic progression of SCLC in CGRP TKO mice are categorically different from those in CMV TKO mice, not just by virtue of NFIB expression, but also at the level of global differences in chromatin accessibility.
SCLC Subtypes Have Different Gene-Expression Programs
The overall similarity of the chromatin landscape in CMV TKO and CGRP TKO primary tumors suggests that inherent differences in chromatin accessibility likely do not explain the different trajectories of tumor progression in these models. To further understand the basis for the different metastatic paths of SCLC in CMV TKO versus CGRP TKO mice, we next performed RNA sequencing (RNA-seq) on FACS-isolated cancer cells from both models. Unsupervised clustering and principal component analysis (PCA) showed a clear separation of CMV TKO primary tumors from CMV TKO liver metastases. CGRP TKO primary tumors and metastases clustered together and more closely with CMV TKO primary tumors (Fig. 4A; Supplementary Fig. S7A).
Some metastatic primary tumors in the CMV TKO model express high levels of Nfib (Supplementary Fig. S7B and ref. 7), and Nfib expression correlated with the clustering of CMV TKO primary tumors and metastases (Supplementary Fig. S7C). To uncover gene-expression changes during metastatic progression of CMV TKO tumors, we thus performed differential gene-expression analysis between NFIBlow CMV TKO primary tumors and CMV TKO metastases. This analysis revealed widespread changes in gene-expression programs during metastatic progression of this subtype of SCLC (Fig. 4B; Supplementary Table S3). Genes that were upregulated in CMV TKO metastases were enriched for gene ontology annotations associated with neuronal differentiation and cell cycle, consistent with what we and others have observed upon ectopic expression of NFIB in SCLC cells (refs. 6, 7; Fig. 4C; Supplementary Fig. S8 and Supplementary Table S4).
Surprisingly, very few genes were significantly differentially expressed by more than 2-fold between CGRP TKO primary tumors and metastases (Fig. 4D). As expected from our immunostaining analysis, Nfib expression was not increased in CGRP TKO metastases. In contrast, more than 2,000 genes were differentially expressed between CMV TKO and CGRP TKO primary tumors (Fig. 4E; Supplementary Table S3). Genes that were more highly expressed in CGRP TKO tumors than in NFIBlow CMV TKO tumors were enriched for gene ontology annotations associated with neuronal differentiation, including synaptic signaling (Fig. 4F; Supplementary Fig. S9 and Supplementary Table S4). Notably, although neuronal gene sets were enriched in both CMV TKO metastases and CGRP TKO primary tumors, the genes driving these annotations had little overlap and the neuronal gene programs enriched in CGRP TKO primary tumors relative to NFIBlow CMV TKO primary tumors were specifically related to a more mature neuronal state (Fig. 4G; Supplementary Fig. S9). Together, these data (Fig. 4H) underscore the distinct molecular paths taken by SCLC tumors in the CMV and CGRP TKO models.
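The 2-fold criterion applied in these comparisons can be illustrated with a minimal sketch. The statistical pipeline is not described in this section, so the adjusted P value cutoff of 0.05, the pseudocount, the function names, and the placeholder input values below are illustrative assumptions, not the analysis that was actually performed.

```python
import math


def log2_fold_change(mean_a, mean_b, pseudocount=1.0):
    """log2 ratio of two mean expression values.

    The pseudocount (an assumption) avoids division by zero for unexpressed genes.
    """
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


def differentially_expressed(genes, min_abs_log2fc=1.0, max_padj=0.05):
    """Keep genes that change by more than 2-fold (|log2FC| > 1) and pass the
    adjusted P value cutoff (0.05 is an assumed threshold).

    `genes` is an iterable of (name, mean_in_metastases, mean_in_primaries, padj).
    """
    hits = []
    for name, mean_met, mean_primary, padj in genes:
        lfc = log2_fold_change(mean_met, mean_primary)
        if abs(lfc) > min_abs_log2fc and padj < max_padj:
            hits.append((name, lfc))
    return hits


# Placeholder values for illustration only; these are not data from this study.
example = [
    ("geneA", 799.0, 99.0, 0.001),  # 8-fold up and significant: kept
    ("geneB", 149.0, 99.0, 0.001),  # 1.5-fold up: below the 2-fold cutoff
    ("geneC", 9.0, 99.0, 0.20),     # 10-fold down but not significant
]
hits = differentially_expressed(example)
```

Requiring both an effect-size cutoff and a significance cutoff is what makes "very few genes" a statement about magnitude as well as statistical support.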
To relate the gene-expression states of CMV TKO and CGRP TKO tumors to human SCLC, we merged human SCLC data sets (13) and our mouse SCLC data sets and performed PCA. Many principal components seem to be driven primarily by the CMV versus CGRP tumor types, and human tumors were distributed broadly with some being more similar to CGRP TKO tumors and others being more similar to CMV TKO tumors (Supplementary Fig. S10).
Nonmetastatic CMV TKO Primary Tumors Contain Cells with Diverse Lung Epithelial Lineage Markers
Although tumors and metastases in both the CMV and the CGRP TKO models express neuroendocrine genes, CGRP TKO primary tumors and metastases had slightly higher overall expression of neuroendocrine programs compared with CMV TKO primary tumors or metastases (Supplementary Fig. S11). In contrast, compared with CGRP TKO primary tumors, CMV TKO primary tumors specifically had higher expression of lineage markers of several diverse lung epithelial cell types, including club cells (e.g., Scgb1a1/CC10 and Scgb3a2), alveolar type II cells (e.g., Sftpb and Lamp3), and alveolar type I cells (e.g., Aqp5 and Ager; Fig. 5A; Supplementary Table S3). Neither SCLC subtype highly expressed canonical markers of basal cells or lineage-negative epithelial progenitors (LNEP; Krt5 and Trp63) or ciliated cells (Foxj1; Supplementary Fig. S12A). Thus, although CMV TKO tumors still express neuroendocrine markers, they also express a wider range of lung epithelial markers.
Many of these lineage genes have diverse expression in human SCLC, consistent with the intertumoral heterogeneity uncovered in the mouse SCLC models (Supplementary Fig. S12B). At the protein level, primary tumors and metastases from both mouse models expressed neuroendocrine markers, including ASCL1 and UCHL1 (Supplementary Fig. S1B; Fig. 5B). However, as anticipated from our RNA-seq analysis, other markers including CC10 and SELENBP1 (a marker of bronchiolar and alveolar non-neuroendocrine cells) were specifically expressed in CMV TKO nonmetastatic primary tumors (Fig. 5B–D; Supplementary Fig. S12A and S13A–S13C). Cells expressing these non-neuroendocrine markers were uniquely present in the NFIBnegative areas within primary CMV TKO SCLC (Supplementary Fig. S13D).
Interestingly, CC10 and SELENBP1 were highly expressed in only a subset of cells in CMV TKO primary tumors (Fig. 5B; Supplementary Fig. S13C). Because CC10 and SELENBP1 are also expressed in normal lung epithelial cells (33, 34), we determined whether these cells were cancer cells or normal cells within the tumors. In CMV-Cre–initiated tumors in TKO;Motley mice, CC10positive and SELENBP1positive cells existed within the clonal cancer cell population, indicating that these cells are cancer cells (Fig. 5E; Supplementary Fig. S13E).
Together, these analyses further underscore the difference between CMV TKO and CGRP TKO primary tumors and uncover the presence of subsets of cancer cells expressing markers of multiple lung epithelial lineages in CMV TKO primary tumors.
The Two Distinct Subtypes of Mouse SCLC Arise from Different Cell Types of Origin
The simplest explanation for the differences observed during tumor progression of the CMV TKO and CGRP TKO models is that the tumors are initiated from different cell types. Ad-CMV-Cre and Ad-CGRP-Cre both lead to expression of Cre in neuroendocrine cells but Ad-CMV-Cre also leads to the expression of Cre in many other cell types in the lung (20, 21). The development of fewer tumors in CGRP TKO mice than in CMV TKO mice, even with a higher titer of Ad-CGRP-Cre (Supplementary Fig. S1A), supports the idea that SCLC is initiated from rare neuroendocrine cells in the CGRP TKO model but from a larger pool of cells in the CMV TKO model.
Neuroendocrine cells are often present at bifurcation points in bronchioles as well as in larger airways but are not normally found in alveoli (27, 35, 36). Accordingly, CGRP TKO tumors grow predominantly within the proximal airways, in both the large and small bronchioles. In contrast, CMV TKO tumors grew in both the proximal and distal lung with early-stage CMV TKO lesions located at large and small bronchioles, terminal bronchioles, and at bronchial–alveolar duct junctions (BADJ; Fig. 6A; Supplementary Fig. S14).
In addition to the spatial analysis, we also quantitatively reevaluated the potential of the major lung cell types to give rise to SCLC. Cancer initiation with Ad-CMV-Cre is more than 50-fold more efficient than adenoviral vectors with tissue-specific promoters targeting Cre expression to CGRPpositive NE cells, SPCpositive cells (mostly alveolar type II cells), and CC10positive cells (mostly club cells). The high number of tumors initiated by Ad-CMV-Cre relative to other vectors was further confirmed by analysis at late stage (10 months after tumor initiation). None of the cell types that we targeted recapitulate the high tumorigenic efficiency of Ad-CMV-Cre (Fig. 6B and C; Supplementary Fig. S15A–S15B).
Collectively, although our data support the idea that different cell types of origin provide the most likely explanation for the difference in tumor number and metastatic programs between the CMV TKO and CGRP TKO mouse models, we carefully investigated other variables. First, the time of tumor development was not a major determinant of the acquisition of NFIB-driven metastatic ability, as CMV TKO and CGRP TKO mice analyzed at the same time point after tumor initiation still had NFIBpositive and NFIBnegative metastases, respectively (Supplementary Fig. S16A–S16B). Second, incomplete recombination of the conditional alleles due to lower Cre expression from one of the viral vectors was largely ruled out by PCR genotyping of the floxed alleles in CMV TKO and CGRP TKO tumors (Supplementary Fig. S4B). Third, although 10- to 20-fold higher titers of Ad-CGRP-Cre were used to initiate tumors, a nonspecific effect of overall adenoviral titer did not explain our findings: Mice with tumors initiated with Ad-CMV-Cre combined with high titers of control Ad-CMV-GFP or Ad-CC10-Cre still developed CC10positive primary tumors and NFIBhigh metastases (Supplementary Fig. S16C–S16E). Finally, in the CMV TKO model, Cre is expressed broadly in the lung, leading to the generation of many “normal” epithelial cells that lack the floxed tumor suppressor genes. The coexistence of many other cells with these genetic alterations did not contribute to the high number of tumors initiated by CMV-Cre, as TKO;mTmG mice transduced with Ad-CGRP-Cre with or without the addition of Ad-CC10-Cre or Ad-CC10-Cre and Ad-SPC-Cre still developed comparable and low numbers of tumors (Fig. 6C; Supplementary Fig. S15C).
Thus, the most likely explanation for the difference in tumor number and progression remains that Ad-CMV-Cre leads to tumor suppressor inactivation in one or more cell types that do not express Cre after transduction with Ad-CGRP-Cre. These cells ultimately give rise to SCLC tumors that are sufficiently molecularly distinct to result in strikingly different evolutionary paths toward metastasis (Fig. 6D).
Discussion
SCLC is an aggressive cancer type and patients often present or relapse with widespread metastatic disease. Our data from defined genetically engineered mouse models indicate that SCLC is not inherently metastatic and that key genetic and epigenetic changes occur during cancer progression (this study and refs. 5–7, 37). Furthermore, we provide evidence that SCLC can arise from different cell types, which has profound influences on the course of tumor evolution.
NFIB is highly expressed in more than 50% of human SCLC metastases (refs. 6, 7, 13; Fig. 1G; Supplementary Fig. S2), suggesting that upregulation of this transcription factor could influence the metastatic ability of a large fraction of human SCLC. Deletion of Rb/Trp53 or Rb/Trp53/p130 in mice after delivery of Ad-CMV-Cre to the lungs may provide an accurate model for this subset of human tumors. SCLC initiated specifically from mature neuroendocrine cells using Ad-CGRP-Cre may model a different subset of human SCLC that does not amplify or upregulate Nfib as often and thus may metastasize via different mechanisms. We cannot exclude, however, that specific combinations of genetic alterations in mature neuroendocrine cells may affect the frequency of Nfib upregulation during SCLC progression. Notably, although SCLC in humans often develops within the lobar or main bronchi and is identified by imaging within the central aspect of the chest, 5% to 15% of cases of SCLC are more peripheral or even subpleural (38–41). These peripheral tumors are histologically similar to central SCLC and are also highly metastatic (42). The more peripheral location of tumors in the CMV TKO model suggests that CMV TKO mice may model this poorly characterized subset of human tumors.
Why CGRP TKO tumors and a fraction of human SCLC tumors metastasize without upregulation of NFIB is unknown, especially when this factor can be such a strong oncogene and prometastatic driver. We did not observe large-scale differences in genome-wide chromatin accessibility between CMV TKO and CGRP TKO primary tumors, suggesting that cancer cells in both models could be responsive to NFIB. However, small changes in accessibility at specific loci and/or other chromatin differences may render cancer cells in CGRP TKO tumors less responsive to NFIB upregulation. SCLC derived from mature neuroendocrine cells may be in a more differentiated state that is less amenable to epigenetic transformation by NFIB. The Nfib locus itself could be less frequently amplified or the expression of Nfib could be regulated in a different manner in SCLC derived from mature neuroendocrine cells. Finally, it is possible that cofactors required for NFIB action are not present in SCLC derived from CGRPpositive mature neuroendocrine cells. Interestingly, some primary tumors in the CGRP TKO model do express high levels of NFIB (Fig. 1E). The expansion of NFIBhigh cancer cells in these tumors could suggest that NFIB still has oncogenic properties in this cellular context, but without promoting metastasis.
Whether the molecular paths taken by distinct SCLC tumors affect their response to specific treatments remains an important question. In culture, we did not identify significant differences in the responses of cell lines derived from NFIBhigh CMV-Cre and NFIBlow CGRP-Cre tumors to three different drugs: cisplatin, etoposide, and a CHK1 inhibitor. In general, these cell lines had diverse sensitivities to these drugs. Even given this diversity, CGRP TKO cell lines had a trend toward higher IC50 values for each drug (Supplementary Fig. S17). Future studies, including in vivo studies on the genetically engineered mouse models and/or in clinical trials, will be required to determine whether different SCLC subtypes define therapeutic responses to immunotherapy, epigenetic therapies, or other treatments.
In addition to recent observations that heterogeneity within individual tumors can play a role in SCLC progression (14, 16, 18, 19), accumulating evidence supports the notion that SCLC is not one disease, but rather many diseases that are distinguishable only at the molecular level (12, 13, 17). Across many cancer types, the genetic and epigenetic changes that occur during tumor evolution are a major focus of efforts to understand this intertumoral heterogeneity. However, the impact of the cell type of origin on cancer initiation and progression has garnered less interest (43–47). Distinct cell types may give rise to different tumor subtypes with distinct histologic features; however, in the case of SCLC, although the two subtypes metastasize using different mechanisms, the resulting metastases were histologically indistinguishable.
Our results indicate that fundamental differences in tumor development between the two mouse models may be endowed upon the tumor by the cell type of origin. The identity of the cell type(s) of origin in the CMV TKO model remains unknown. Previous work using Rb1flox/flox;Trp53flox/flox DKO mice and our analysis of Rb1flox/flox;Trp53flox/flox;p130flox/flox TKO mice suggests that lung cell types expressing CC10 or SPC (including club cells and alveolar type II cells) are unlikely to be cell types of origin for SCLC (4, 20, 21). Based on the expression of classic markers of several lung epithelial lineages specifically in CMV TKO primary tumors, including markers of club cells and alveolar type II and I cells, we speculate that SCLC in this model arises from stem/progenitor cells or facultative stem/progenitor cells that gain multilineage potential upon inactivation of Rb1 and Trp53. Indeed, RB1 loss promotes cellular plasticity (48) and can lead to neuroendocrine differentiation/transdifferentiation in lung adenocarcinoma and prostate cancer (49–51). The formal identification of this cell type or types will require new tools to express Cre specifically in defined subpopulations of lung epithelial cells (23, 24, 52).
Patients with SCLC continue to have one of the worst survival rates of all patients with cancer. Our work indicates that multiple cell types in the lung can serve as the cell of origin of SCLC and suggests that biomarkers of these different subsets of tumors may help predict their evolutionary trajectories toward malignancy. Understanding this diversity may ultimately have value in enabling better patient stratification and more precise treatment plans.
Methods
Ethics Statement
Mice were maintained according to practices prescribed by the NIH at Stanford's Research Animal Facility accredited by the American Association for Accreditation of Laboratory Animal Care. All animal studies were conducted following approval from the Stanford Animal Care and Use Committee (protocol 13565).
Mice, Adenoviral Infections, and Cancer Cell Isolation
Rb1flox, Trp53flox, p130flox, R26mTmG, and R26Motley alleles have been described (7, 26, 31, 53, 54). The SCLC mouse model bearing deletions of Trp53, Rb, and p130 was previously described (26). Ad-CMV-Cre, Ad-CGRP-Cre, Ad-SPC-Cre, Ad-CC10-Cre, and Ad-CMV-GFP were purchased from the University of Iowa viral vector core. Intratracheal administration was performed as previously described (7, 26). Unless specified in the text, for long-term experiments, we used Ad-CMV-Cre at 4 × 10⁷ PFU. Ad-CGRP-Cre or Ad-CC10-Cre were used at 4 × 10⁸ or 8 × 10⁸ PFU per mouse, respectively. Primary tumors and metastases were harvested when mice became moribund. Mice were maintained at the Stanford Research Animal Facility accredited by the Association for Assessment and Accreditation of Laboratory Animal Care. Cancer cells were isolated and purified by FACS as previously described (7).
Cell Culture
SCLC cell lines were passaged as previously described (7). For drug response assays, cells were plated in a 96-well plate at 10,000 cells per well, in triplicate, in 100 μL of media. After 24 hours, drugs or vehicle control were added to the cells at a 2× concentration in 100 μL, for a total volume of 200 μL per well. For each drug, seven different concentrations were tested at 10-fold dilutions. Forty-eight hours after the drugs were added, 20 μL of Alamar Blue was added to each well, and fluorescence values were measured 4 hours later. IC50 values were calculated using GraphPad Prism 7. Cisplatin and etoposide were obtained from the Lucile Packard Children's Hospital at Stanford. The CHK1 inhibitor LY2606368 was a gift from the lab of Dr. Lauren Byers (MD Anderson Cancer Center) and reconstituted in DMSO.
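The arithmetic of this plate layout can be sketched as follows: a seven-point, 10-fold dilution series, the halving of each 2× stock on addition to the well, and normalization of fluorescence to the vehicle control. The IC50 estimate shown here uses simple log-linear interpolation as a stand-in; it is not the curve fit performed in GraphPad Prism. The function names and the example top concentration are hypothetical.

```python
import math


def dilution_series(top_conc, n_points=7, fold=10.0):
    """Concentrations from top_conc downward, each step diluted by `fold`."""
    return [top_conc / fold ** i for i in range(n_points)]


def final_concentration(added_conc, added_volume_ul=100.0, well_volume_ul=100.0):
    """Concentration in the well after adding drug to medium already present.

    With 100 uL of 2x drug added to 100 uL of medium, this halves the stock.
    """
    return added_conc * added_volume_ul / (added_volume_ul + well_volume_ul)


def percent_viability(fluorescence, vehicle_fluorescence):
    """Alamar Blue signal expressed as a percentage of the vehicle control."""
    return 100.0 * fluorescence / vehicle_fluorescence


def ic50_by_interpolation(concs, viability):
    """Rough IC50: interpolate on log10(concentration) between the two doses
    that bracket 50% viability. Returns None if 50% is never crossed.

    This is an illustrative approximation, not a four-parameter curve fit.
    """
    pairs = sorted(zip(concs, viability))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(pairs, pairs[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical 2x stocks with a top concentration of 200 (arbitrary units).
stocks_2x = dilution_series(200.0)
in_well = [final_concentration(c) for c in stocks_2x]
```

Because the dilutions are 10-fold, interpolating on the log scale keeps the estimate from being dominated by the higher of the two bracketing doses.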
Immunoassays
Protein levels of NFIB and ASCL1 for the tumor-derived cell lines used in IC50 experiments were determined using the Simple Western quantitative immunoassay and the Compass software, according to the manufacturer's protocol. Cells were lysed in TNESV lysis buffer (50 mmol/L Tris–HCl at pH 7.6, 1% IGEPAL, 20 mmol/L EDTA at pH 8.0, 100 mmol/L NaCl), supplemented with proteasome and phosphatase inhibitors, and lysates were cleared by centrifugation at maximum speed for 10 minutes. Total protein was quantified using the Pierce BCA Protein Assay Kit (Thermo Fisher, cat. #23277). Whole-cell lysates were diluted to a final concentration of 0.2 μL/mL. The antibodies and dilutions used were as follows: NFIB (Abcam, ab186738) 1:20,000; ASCL1 (BD Pharmingen, 556604) 1:1,000; and HSP90 (Cell Signaling Technology, 4877S) 1:2,000.
Histology and IHC
Tumor samples were fixed in 4% formalin and paraffin embedded. Hematoxylin and eosin staining was performed using standard methods. Images were quantified using ImageJ. For IHC, we used antibodies to NFIB (1:1,000; Abcam ab186738), UCHL1 (1:500; Sigma HPA005993), CC10 (CCSP; 1:1,000; Millipore 07-623), SELENBP1 (1:200; Abcam), GFP (1:500; Abcam ab6673), ASCL1 (1:200; BD Biosciences, 556604), and RFP (1:500; Rockland).
SCLC Patient Brain Metastasis Sections and IHC Staining Score
Human SCLC brain metastasis tissue sections were collected by Anna S. Berghoff and Matthias Preusser from the Medical University of Vienna, Austria. IHC staining for NFIB (1:1,000; Abcam ab186738) was performed on 4-μm formalin-fixed paraffin-embedded sections. NFIB expression was scored by a board-certified pathologist, C.S. Kong, as follows: 0, negative or weak staining of less than 10% of cells; 1, weak staining of more than 10% of cells; 2, moderate intensity staining; 3, strong intensity staining.
Immunofluorescence
FACS-sorted cancer cells from primary tumors, DTCs, and liver metastases were cytospun onto glass slides at 500 rpm for 5 minutes. Cells were fixed with 4% paraformaldehyde for 15 minutes and stained for NFIB (Abcam; ab186738; 1:2,000) and with a goat anti-rabbit secondary antibody (Invitrogen). For imaging, membrane GFP staining was confirmed to indicate a DTC, and NFIB expression was checked through the far-red channel using a fluorescence scope (Leica). NFIB staining was quantified by counting directly under the scope. On average, we quantified 30 to 100 cells per sample based on how many cells were harvested.
Whole-Mount Immunofluorescence Staining
Lungs were dissected from mice after perfusion with ∼5 mL PBS into the right cardiac ventricle and intratracheal inflation with ∼2 mL 2% low melting point agarose (Thermo Fisher; cat. #16520) in PBS. The lungs were fixed in 4% paraformaldehyde (EMS; cat. #15714) for 6 hours at 4°C and then sectioned with a vibrating blade microtome (Leica Biosystems; cat. #VT1000S) at 500-μm thickness. Lung slices were stained with rabbit anti-CGRP (Enzo Life Sciences; cat. #BML-CA1137; 1:1,000 dilution) for 5 to 7 days at 4°C, then with Alexa Fluor 647–conjugated anti-rabbit IgG (Thermo Fisher; cat. #A-21244; 1:500 dilution) and streptavidin-conjugated Alexa Fluor 405 (Thermo Fisher; cat. #S32351, 4 μg/mL final concentration) as a pan-lung epithelial identifier (55) for 3 to 4 days at 4°C. Finally, stained sections were optically cleared using the CUBIC method (56), which consisted of a 3-hour incubation at room temperature in CUBIC 1 reagent followed by long-term storage in CUBIC 2 at 4°C. Sections were imaged using a Zeiss LSM 780 laser scanning confocal microscope with an inverted 5× air objective (Carl Zeiss AG, NA = 0.45), and optical sections were collected at 10 μm resolution.
ATAC-seq Library Preparation, Sequencing, and Analysis
ATAC-seq library preparation was performed as described (7).
Calling Peaks
Accessible regions were called by performing peak calling to obtain peak summits, merging peak calls obtained in different sequencing batches, and subsequently filtering to remove any overlapping regions. MACS2 (v2.1.0.20140616) was used to call peak summits on the merged bam file of each sequencing batch of samples (three total) using the command “macs2 callpeak --nomodel --call-summits --keep-dup all”. Peak summits were expanded by 250 bp on each side to form 500-bp windows. Each set of accessible regions was filtered to remove blacklist regions (mm9; https://sites.google.com/site/anshulkundaje/projects/blacklists) and regions of copy-number amplification: any peak that overlapped a region that differed from expected by more than 2-fold (up or down) for any sample was removed (copy-number amplification was determined as described below). Finally, the three peak sets were concatenated, while retaining overlapping peak windows.
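The window construction and region filtering can be illustrated with a short sketch. It is not the pipeline used in the study; the function names, the 0-based half-open coordinate convention, and the example coordinates are all assumptions made for illustration.

```python
def summit_to_window(chrom, summit, flank=250):
    """Expand a summit position into a (chrom, start, end) window.

    Coordinates are 0-based, half-open; the start is clipped at 0.
    """
    return (chrom, max(0, summit - flank), summit + flank)


def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_excluded(windows, excluded):
    """Drop every window that overlaps any excluded interval
    (for example blacklist or copy-number-altered regions)."""
    return [w for w in windows if not any(overlaps(w, e) for e in excluded)]


# Hypothetical summits and one hypothetical excluded region.
windows = [summit_to_window("chr1", s) for s in (1000, 5000)]
kept = remove_excluded(windows, [("chr1", 4900, 5100)])
```

The pairwise overlap test is quadratic in the number of intervals, which is adequate for an example but would normally be replaced by an interval tool such as bedtools on genome-scale data.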
Initially, all accessible regions were filtered to obtain only regions with sufficient read depth to compare across samples. The number of overlapping reads per peak was found for all samples using the bedtools multicov module (v2.25.0), using only reads with high mapping quality (-q 30). The read counts per peak for each sample were normalized by the mean read count per peak for that sample, such that all samples had the same mean reads per peak. Peaks were kept if they fulfilled either of two criteria: (i) the peak had at least 1 read in all samples or (ii) the peak had at least 1 sample with a large number of normalized reads (>1 standard deviation above the mean). This operation was performed to filter out peaks with low counts while retaining peaks that may be differentially accessible.
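The two retention criteria can be sketched as below. Two details are assumptions rather than statements from the text: criterion (i) is applied to raw counts, and the mean and standard deviation in criterion (ii) are computed over all normalized values pooled across peaks and samples. The function names are hypothetical.

```python
from statistics import mean, pstdev


def normalize_by_sample_mean(counts):
    """counts maps sample -> per-peak read counts.
    Divide each sample by its own mean so every sample has mean 1."""
    return {s: [c / mean(v) for c in v] for s, v in counts.items()}


def keep_peaks(counts):
    """Return one boolean per peak: True if the peak passes either criterion."""
    norm = normalize_by_sample_mean(counts)
    samples = list(counts)
    n_peaks = len(counts[samples[0]])
    pooled = [x for s in samples for x in norm[s]]
    cutoff = mean(pooled) + pstdev(pooled)  # assumed pooled definition
    keep = []
    for k in range(n_peaks):
        covered = all(counts[s][k] >= 1 for s in samples)   # criterion (i)
        high = any(norm[s][k] > cutoff for s in samples)    # criterion (ii)
        keep.append(covered or high)
    return keep


# Hypothetical counts for two samples and four peaks.
example = {"A": [5, 0, 1, 0], "B": [4, 0, 0, 40]}
mask = keep_peaks(example)
```

In this example the last peak has no reads in sample A but is retained because of its high normalized count in sample B, which is the case criterion (ii) is meant to protect.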
Finally, any overlapping windows were filtered to obtain a set of nonoverlapping windows. Overlapping windows were identified using the bedtools cluster module. For each set of overlapping windows, only the one with the greatest number of normalized reads was kept. In the end, ∼114,000 accessible regions (nonoverlapping, uniform width) were obtained and used in subsequent analyses.
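A minimal sketch of the overlap resolution step follows. It groups windows into clusters in the spirit of the bedtools cluster module and keeps the highest-scoring window in each cluster; it uses strict overlap, which may treat directly adjacent windows differently from bedtools. The function name and example data are hypothetical.

```python
def best_per_cluster(windows):
    """windows: list of (chrom, start, end, score) tuples.

    Windows are grouped into clusters of transitively overlapping intervals
    on the same chromosome; the highest-scoring window per cluster is kept.
    """
    kept = []
    best = None
    cluster_chrom = None
    cluster_end = None
    for w in sorted(windows, key=lambda w: (w[0], w[1])):
        chrom, start, end, score = w
        if best is not None and chrom == cluster_chrom and start < cluster_end:
            # Same cluster: extend it and update the best window if needed.
            cluster_end = max(cluster_end, end)
            if score > best[3]:
                best = w
        else:
            # New cluster: emit the previous cluster's best window.
            if best is not None:
                kept.append(best)
            best, cluster_chrom, cluster_end = w, chrom, end
    if best is not None:
        kept.append(best)
    return kept


# Hypothetical windows with normalized read counts as scores.
example_windows = [
    ("chr1", 100, 600, 3.0),
    ("chr1", 400, 900, 5.0),
    ("chr1", 1000, 1500, 1.0),
    ("chr2", 100, 600, 2.0),
]
nonoverlapping = best_per_cluster(example_windows)
```

Because one window is kept per cluster and clusters do not overlap one another, the output is a set of nonoverlapping windows of the original uniform width.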
Normalizing Read Counts within Accessible Regions
The number of reads per peak was obtained to assess the differential chromatin landscape across samples. To estimate the effect of sequencing depth, a set of “housekeeping” regions was obtained that was expected to have similar mean accessibility across samples. These “housekeeping” regions were determined by finding the set of genes that were uniformly expressed across CMV tumors and metastases (as in Denny and colleagues; ref. 7); peaks corresponding to the promoters of these genes were determined to be “housekeeping” peaks (Supplementary Table S2). The inverse of the mean read count across all “housekeeping” peaks was used as the sample-specific size factor for normalization in DESeq2 (57). These factors were divided by the geometric mean of all the size factors, as recommended in the DESeq2 vignette (https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#sample-gene-dependent-normalization-factors). Size factors were converted to peak-specific normalization factors by using the sample-specific size factor for all peaks.
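The size factor calculation can be sketched as below. The sketch follows the description as written (inverse of the mean housekeeping count, rescaled by the geometric mean of all factors) and does not call DESeq2; the function names and the example counts are assumptions.

```python
import math


def housekeeping_size_factors(hk_counts):
    """hk_counts maps sample name -> read counts in housekeeping peaks.

    The raw factor is the inverse of the mean housekeeping count; all
    factors are then divided by their geometric mean.
    """
    raw = {s: 1.0 / (sum(v) / len(v)) for s, v in hk_counts.items()}
    geo_mean = math.exp(sum(math.log(f) for f in raw.values()) / len(raw))
    return {s: f / geo_mean for s, f in raw.items()}


def normalization_matrix(factors, samples, n_peaks):
    """Repeat each sample's factor for every peak (rows are peaks)."""
    return [[factors[s] for s in samples] for _ in range(n_peaks)]


# Hypothetical counts for two samples in two housekeeping peaks.
factors = housekeeping_size_factors({"A": [10, 10], "B": [40, 40]})
matrix = normalization_matrix(factors, ["A", "B"], n_peaks=3)
```

After rescaling, the geometric mean of the factors is 1, so the normalization changes the relative scaling between samples without shifting the overall count scale.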
Number of Reads Expected from Background Fragmentation
The number of reads expected to come from background fragmentation during transposition was obtained by counting reads overlapping a set of background intervals (entire genome, excluding blacklist regions, and excluding any 10-Mb interval with evidence of copy-number amplification in any sample), to form a total number of reads M for any sample. The number of reads within accessible regions for each sample (m) was subtracted from M and divided by the number of base pairs in the background intervals (L), less the number of base pairs in accessible regions (l), such that the number of background reads per base pair per sample was $b = (M - m)/(L - l)$. For accessible region k of size $l_k$, the number of reads expected simply from background fragmentation was $n_k = l_k b$. In practice, because all of the accessible regions were of uniform size, all accessible regions had the same $l_k$.
To obtain background-subtracted read counts within accessible regions, $n_k$ was subtracted from the number of reads overlapping each accessible region for each sample. These subtracted values were rounded to the nearest integer, and any negative values were set to zero. For all samples with enrichment score >11, the number of background reads per peak was less than ∼5 (normalized for sequencing depth).
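The background correction amounts to a few lines of arithmetic, sketched here with hypothetical numbers. The function names and all example values are assumptions for illustration only.

```python
def background_rate(M, m, L, l):
    """Background reads per base pair: b = (M - m) / (L - l)."""
    return (M - m) / (L - l)


def subtract_background(peak_counts, peak_width, b):
    """Subtract the expected background reads n_k = l_k * b from each peak,
    round to the nearest integer, and set negative values to zero.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    n_k = peak_width * b
    return [max(0, round(c - n_k)) for c in peak_counts]


# Hypothetical sample: 1.2 M reads in background intervals, 0.2 M of them in
# accessible regions; 2.05 Gb of background intervals, 50 Mb of it accessible.
b = background_rate(M=1_200_000, m=200_000, L=2_050_000_000, l=50_000_000)
corrected = subtract_background([0, 1, 10], peak_width=500, b=b)
```

With these numbers the expected background is 0.25 reads per 500-bp window, so the correction leaves well-covered peaks essentially unchanged.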
Assessing Differential Accessibility
Through unsupervised clustering and PCA, we determined that samples with low enrichment (calculated as the maximum number of reads around transcription start sites over the minimum number of reads, as in ref. 7) were associated with differences in chromatin accessibility. This effect was associated with batch and not with Nfib amplification (Supplementary Fig. S5). To minimize this technical artifact, samples with low enrichment (<11) were filtered from the analysis.
Differential accessibility was determined using the package DESeq2 (57). Counts per peak were background subtracted to account for expected counts per peak due to background fragmentation (see the above section). Each sample was annotated by the model (CGRP vs. CMV), the metastatic state (T for primary tumor or L for distal metastases, i.e., liver or lymph node), and the Nfib amplification state (1 if amplified, 0 if unamplified). This annotation can be found in the column “model_met_amp” in Supplementary Table S5. The design formula used in DESeq2 was then “~model_met_amp”. For the CGRP distal metastases versus primary tumor comparison, results were extracted using the command “res_G_LvsT <- results(dds, contrast=c('model_met_amp', 'G_L_0', 'G_T_0'))” (Supplementary Table S5).
Assessing Copy-Number Amplification from ATAC-seq Data
Copy-number amplification was estimated from ATAC-seq data by determining read counts in large (10 Mb) intervals across the genome (2 Mb step size) and identifying intervals whose read counts differed from expected, where the expectation is based on the average read count across 100 GC-matched intervals. Specifically, the GC content of each interval was found using the bedtools nuc module. For each interval, the 100 other intervals with the closest GC content to that interval were found and used as the GC-matched intervals.
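The GC-matched expectation can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, ties in GC distance are broken by input order, and the example uses 2 matched intervals instead of 100 to keep the data small.

```python
def gc_matched_ratio(counts, gc, n_match=100):
    """Observed / expected read count for each interval.

    counts, gc: per-interval read counts and GC fractions in the same order.
    The expectation for an interval is the mean count of the `n_match` other
    intervals whose GC content is closest to it.
    """
    if len(counts) < 2:
        raise ValueError("need at least two intervals")
    ratios = []
    for i in range(len(counts)):
        others = sorted(
            (j for j in range(len(counts)) if j != i),
            key=lambda j: abs(gc[j] - gc[i]),
        )[:n_match]
        expected = sum(counts[j] for j in others) / len(others)
        ratios.append(counts[i] / expected if expected > 0 else float("nan"))
    return ratios


# Hypothetical intervals; the third has three times the expected coverage.
ratios = gc_matched_ratio([100, 100, 300, 100], [0.40, 0.41, 0.42, 0.43], n_match=2)
altered = [r > 2 or r < 0.5 for r in ratios]
```

The `altered` flag applies the 2-fold threshold (up or down) that the peak-filtering step uses to exclude regions with copy-number changes.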
RNA-seq Library Preparation and RNA-seq Analysis
RNA was extracted from FACS-sorted cancer cells using Qiagen Allprep DNA/RNA kit. All the samples were checked by BioAnalyzer (Agilent) for RNA integrity (RIN > 8). cDNA was prepared from 10 ng of RNA using the Nugen Ovation V2 kit, according to the manufacturer's instructions. RNA-seq libraries were generated using the Illumina Truseq kit.
We used Kallisto to quantify expression (58). We kept only transcripts with max transcript per million (TPM) value greater than 1 in at least one sample and with standard deviation greater than 1. We performed the asinh() transformation to variance stabilize the data.
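The transcript filter and transformation can be sketched as below. The sketch assumes that the standard deviation is the sample standard deviation computed on untransformed TPM values, which the text does not specify; the function name and example values are hypothetical.

```python
import math
from statistics import stdev


def filter_and_transform(tpm):
    """tpm maps transcript -> TPM values across samples.

    Keep transcripts whose maximum TPM exceeds 1 and whose standard deviation
    exceeds 1, then apply asinh to each retained value.
    """
    out = {}
    for transcript, values in tpm.items():
        if max(values) > 1 and stdev(values) > 1:
            out[transcript] = [math.asinh(v) for v in values]
    return out


# Hypothetical TPM values for three transcripts in three samples.
example_tpm = {
    "t1": [0.0, 5.0, 10.0],   # variable and expressed: kept
    "t2": [0.2, 0.3, 0.1],    # never above 1 TPM: dropped
    "t3": [2.0, 2.1, 2.2],    # expressed but nearly constant: dropped
}
transformed = filter_and_transform(example_tpm)
```

The asinh transform behaves like a logarithm for large values but is defined at zero, so no pseudocount is needed for unexpressed transcripts.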
Because the Illumina library preparation method had been slightly modified between the two library preparations, we observed a strong batch effect when comparing the two data sets. To correct for this batch effect, we included three RNA samples from our earlier CMV TKO data set when generating the CGRP TKO libraries.
We used SVA to remove batch effects from the data (59). The three samples shared between the CGRP TKO and CMV TKO batches are otherwise identical, so they indicate which differences arise purely from experimental procedure and other confounding effects. We used SVA to estimate confounding factors while protecting the sample name, so that the algorithm removed only batch effects confounding the identical samples and otherwise had no information about sample identity. We then removed the estimated confounding factors from the data. After batch correction, the batch difference among the three repeated samples was comparable to the difference between technical replicates generated in a single RNA-seq experiment.
Differential Expression and Gene Ontology Analysis
To discover the genes driving transitions between different stages of metastasis, we used the following procedure on batch-corrected data (see previous section): (i) fit a linear model (lmFit in R) modeling metastatic effect, and (ii) rank genes in order of evidence for differential expression using an empirical Bayes method (60). We compared only genes with nonzero variance. We performed four comparisons: CGRP primary tumors versus CGRP metastases, CGRP primary tumors versus CMV primary tumors, CMV primary tumors versus CMV metastases, and CMV metastases versus CGRP metastases. Significantly differential genes were those with an adjusted P value below 0.05. Gene ontology analysis and neuroendocrine signature enrichment analysis were performed using gene set enrichment analysis (61).
Accession Numbers
The accession number for the RNA-seq data reported in this paper is GEO: GSE116977, and for the ATAC-seq data is GEO: GSE117177.
Comparison with Human Samples
We downloaded human SCLC RNA-seq studies EGAD00001001431 and EGAD00001001244 from the European Genome–Phenome Archive. We quantified gene expression for all samples (n = 74) from the two studies again using Kallisto to generate TPM for each sample.
In order to compare expression values for human samples and mouse samples, we subset all data sets to use only genes that shared the same gene names between the two species (n = 14,813 genes). After merging human and mouse data sets on shared genes, we performed the asinh transformation for variance stabilization.
We then performed PCA on this merged data set. PCA allows us to inspect different axes of variation, such as those driven by species, cell type of origin, and metastatic state without further transforming human and mouse data to make them comparable. We plotted PC1 versus PC2, PC1 versus PC3, and PC2 versus PC3. We found that PC1, as expected, is driven primarily by differences between mouse and human samples. We found that PC3 seems to be driven primarily by tumor type with a clear continuum among mouse samples from CGRP to CMV Met samples. We used this PC3 in order to understand how the distribution of human samples falls relative to the mouse distribution of samples.
Disclosure of Potential Conflicts of Interest
H.C. Reinhardt reports receiving consulting and lecture fees from AbbVie, AstraZeneca, Vertex, and Merck, and research funding from Gilead Pharmaceuticals. A. Kundaje is on the scientific advisory board of Epinomics. W.J. Greenleaf is the scientific co-founder of Epinomics. J. Sage reports receiving research funding from AbbVie. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: D. Yang, J. Sage, M.M. Winslow
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): D. Yang, S.K. Denny, A.C. Chaikovsky, J.J. Brady, Y. Ouadah, N.S. Jahchan, J.S. Lim, C.S. Kong, A.S. Berghoff, A. Schmitt, H.C. Reinhardt, M. Preusser
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): D. Yang, S.K. Denny, P.G. Greenside, A.C. Chaikovsky, Y. Ouadah, J.M. Granja, S. Kwok, C.S. Kong, A. Kundaje, M.M. Winslow
Writing, review, and/or revision of the manuscript: D. Yang, S.K. Denny, P.G. Greenside, A.C. Chaikovsky, J.S. Lim, C.S. Kong, H.C. Reinhardt, M. Preusser, A. Kundaje, J. Sage, M.M. Winslow
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): S. Kwok, K.-S. Park, W.J. Greenleaf
Study supervision: A. Kundaje, J. Sage, M.M. Winslow
Acknowledgments
We thank Pauline Chu for technical assistance, and the Stanford PAN Facility and Stanford Shared FACS Facility; Shin-Heng Chiou, Ian Winters, Barbara Grüner, Astrid Gillich, Christopher Murray, Sandra Cristea, Gokul Ramaswami, and Rosanna Ma for sharing reagents and technical help; and David Feldser, Laura Attardi, Mark Krasnow, Tushar Desai, and members of the Winslow and Sage laboratories for helpful comments. This work was supported by a Stanford Cancer Institute Cancer Biology Seed Grant (to M.M. Winslow, W.J. Greenleaf, and J. Sage), National Cancer Institute R01CA206540 (to J. Sage) and R01CA194461 (to K.S. Park), the German-Israeli Foundation for Research and Development (I-65-412.20-2016 to H.C. Reinhardt), the Deutsche Krebshilfe (70113041, 1117240, and 70113041 to H.C. Reinhardt), and the German Ministry of Education and Research (BMBF e:Med 01ZX1303A to H.C. Reinhardt). D. Yang was supported by a Stanford Graduate Fellowship and by a TRDRP Dissertation Award (24DT-0001). S.K. Denny was supported by the Stanford Biophysics training grant (T32 GM008294). A.C. Chaikovsky was supported by the Stanford Cancer Biology training grant (T32 CA009302). A. Kundaje was supported by NIH grant DP2OD022870. P.G. Greenside was supported by the Bio-X Stanford Interdisciplinary Graduate Fellowship (SIGF). Y. Ouadah was supported by a Stanford Graduate Fellowship. S.K. Denny, A.C. Chaikovsky, and Y. Ouadah were supported by the NSF GRFP. J.S. Lim was supported by an A*STAR scholarship (Singapore). J. Sage is the Harriet and Mary Zelencik Scientist in Children's Cancer and Blood Diseases.