Abstract
APOBEC3 enzymes are cytosine deaminases implicated in cancer. Precisely when APOBEC3 expression is induced during cancer development remains to be defined. Here we show that specific APOBEC3 genes are upregulated in breast ductal carcinoma in situ, and in preinvasive lung cancer lesions coincident with cellular proliferation. We observe evidence of APOBEC3-mediated subclonal mutagenesis propagated from TRACERx preinvasive to invasive non–small cell lung cancer (NSCLC) lesions. We find that APOBEC3B exacerbates DNA replication stress and chromosomal instability through incomplete replication of genomic DNA, manifested by accumulation of mitotic ultrafine bridges and 53BP1 nuclear bodies in the G1 phase of the cell cycle. Analysis of TRACERx NSCLC clinical samples and mouse lung cancer models revealed APOBEC3B expression driving replication stress and chromosome missegregation. We propose that APOBEC3 is functionally implicated in the onset of chromosomal instability and somatic mutational heterogeneity in preinvasive disease, providing fuel for selection early in cancer evolution.
This study reveals the dynamics and drivers of APOBEC3 gene expression in preinvasive disease and the exacerbation of cellular diversity by APOBEC3B through DNA replication stress to promote chromosomal instability early in cancer evolution.
This article is highlighted in the In This Issue feature, p. 2355
Introduction
Non–small cell lung cancer (NSCLC) is characterized by somatic copy-number and point mutation intratumor heterogeneity, whereby chromosomal instability (CIN) is associated with an increased risk of recurrence or death (1). Large-scale genomic sequencing studies have implicated APOBEC3 enzymes in somatic mutagenesis, mediating the mutational signatures single-base substitutions (SBS) 2 and 13 in cancer genomes (2). These enzymes form a barrier against viral and transposon replication through cytosine deaminase–dependent and –independent mechanisms (3). However, APOBEC3 family members have also been implicated in the generation of C>T and C>G DNA point mutations in 5′-TCA-3′ and 5′-TCT-3′ trinucleotide motifs during cancer evolution (4, 5). The clinical importance of APOBEC3 in cancer is of broad interest because of its associations with treatment outcome (6), patient outcome (7, 8), point mutation heterogeneity (9, 10), enrichment of APOBEC3 signature mutations in metastases (9, 11, 12), the genesis of oncogenic driver mutations (4, 13, 14), and the immune response (15, 16). Currently, APOBEC3A (A3A) and APOBEC3B (A3B) are thought to be the principal family members involved in APOBEC3-mediated cancer mutagenesis (17–19).
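The APOBEC3 sequence context described above (C>T and C>G substitutions in 5′-TCA-3′ and 5′-TCT-3′ motifs) can be illustrated with a minimal sketch. The function and constant names are hypothetical and are not taken from the study's pipeline; the sketch assumes each mutation is supplied as the reference trinucleotide centered on the mutated base plus the alternate base, and that purine-centered calls are folded onto the pyrimidine strand by reverse complementation.

```python
# Minimal sketch (hypothetical helper names, not the authors' code):
# flag a single-base substitution as lying in the APOBEC3 context
# described in the text, i.e. C>T or C>G within TCA or TCT.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
APOBEC3_MOTIFS = {"TCA", "TCT"}   # trinucleotides named in the text
APOBEC3_ALTS = {"T", "G"}         # C>T and C>G substitutions


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_apobec3_context(trinucleotide, alt):
    """True if the substitution is C>T or C>G in a TCA/TCT motif.

    trinucleotide: reference bases centered on the mutated position.
    alt: the alternate base at the central position.
    """
    tri = trinucleotide.upper()
    alt = alt.upper()
    if len(tri) != 3 or alt not in COMPLEMENT or any(b not in COMPLEMENT for b in tri):
        raise ValueError("expected a 3-base DNA context and a single DNA base")
    # Fold purine-centered calls (G or A) onto the pyrimidine strand.
    if tri[1] in "AG":
        tri = reverse_complement(tri)
        alt = COMPLEMENT[alt]
    return tri in APOBEC3_MOTIFS and alt in APOBEC3_ALTS
```

For example, a G>A change reported in a TGA context on the reference strand corresponds to C>T in TCA on the opposite strand, so the sketch counts it as an APOBEC3-context mutation.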
Several pathways have been identified that drive A3A and A3B gene expression in cancer, including the DNA damage response (20, 21), the PKC–NFκB pathway (22), and the interferon signaling pathway (9). In NSCLC, APOBEC3-mediated mutations are widespread and are enriched as subclonal mutations (1). As a consequence, APOBEC3-mediated mutagenesis is thought to be a late mutagenic process, enriched in tumor subclones in NSCLC (1, 9). How early branched evolution and subclone dispersion occurs in NSCLC is unclear. In breast cancer, this process can already occur in preinvasive ductal carcinoma in situ (DCIS) lesions (23, 24). Although some sequencing studies have reported APOBEC3 signature mutations in preinvasive lung disease (25–28), the timing and consequences of APOBEC3 induction during cancer evolution are currently unclear.
Using a combination of integrated molecular and clinical analysis of a comprehensively annotated prospective cohort study of patients with NSCLC and preinvasive disease in the TRACERx [TRAcking Cancer Evolution through therapy (Rx)] study (1) and preinvasive lung squamous cell carcinoma (LUSC) lesions (29), as well as published data sets, in vitro and in vivo experimentation, we characterize APOBEC3 family mRNA and protein expression during NSCLC and breast cancer evolution and investigate the underlying causes and consequences of APOBEC3 expression in cancer evolution.
Results
APOBEC3 Expression Increases in Preinvasive Lung and Breast Cancer Lesions
We initially investigated APOBEC3 expression during NSCLC evolution by IHC staining of sections from different tumor morphology–defined progression stages with a monoclonal antibody that detects a shared epitope in A3A, A3B, and A3G (30). Control experiments demonstrate that this protocol can detect A3B protein in formalin-fixed paraffin-embedded (FFPE) samples, because nuclear APOBEC3 staining was absent in SKBR3 cells (A3B-null) but was strong in HCC1954 cells (A3B-high; Supplementary Fig. S1A). Weak or no nuclear immunoreactivity was observed in histologically normal bronchial (Fig. 1A) and alveolar epithelium (Fig. 1B). The presence of stromal cells with cytoplasmic APOBEC3 staining below the epithelial lining (Fig. 1A; orange arrow) is indicative of A3G and A3A proteins, but not A3B, which is reported to be extensively nuclear (30–32). We observed more than 10% nuclear positivity in 1% (1 of 94) of normal tissue sections, in 93% (62 of 67) of preinvasive samples, and in 52% (47 of 90) of NSCLC samples, respectively (Fig. 1C). The preinvasive lesions showed the highest APOBEC3 immunoreactivity (Fig. 1A–C; two-tailed Fisher exact test, P ≤ 0.0001), indicating that APOBEC3 expression varies during NSCLC evolution and may peak in preinvasive lesions. Among the 67 preinvasive lesions, 7 were classified as severe dysplasia, 16 as less severe dysplasia, 18 as carcinoma in situ (CIS), 7 as atypical adenomatous hyperplasia (AAH), 13 as adenocarcinoma in situ (AIS), and 6 as minimally invasive adenocarcinoma (MIA). All 7 severely dysplastic lesions showed 60% to 75% nuclear APOBEC3 positivity, whereas the less severe dysplastic lesions showed anywhere between 2% and 50% nuclear APOBEC3 positivity. All 18 CIS lesions showed >75%, 4 of 7 AAH lesions exhibited 10% to 20% and the remaining 3 AAH lesions showed 2% to 10% nuclear APOBEC3 positivity, whereas both the AIS (N = 13) and MIA (N = 6) lesions showed a heterogeneous pattern with overall 10% to 20% nuclear positivity.
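To make the comparison of nuclear positivity concrete, the following sketch computes a two-tailed Fisher exact test from the counts reported above (1 of 94 normal, 62 of 67 preinvasive, and 47 of 90 NSCLC samples with more than 10% nuclear positivity). It is an illustration only: the pairwise table layout is an assumption, because the text does not state which groups were compared, and the function name is hypothetical.

```python
# Minimal sketch, assuming pairwise 2x2 comparisons of the reported counts.
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x positives in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-7))


# Rows: group; columns: (>10% nuclear positivity, <=10% nuclear positivity)
p_preinvasive_vs_normal = fisher_exact_two_tailed(62, 5, 1, 93)
p_preinvasive_vs_nsclc = fisher_exact_two_tailed(62, 5, 47, 43)
print(p_preinvasive_vs_normal < 1e-4, p_preinvasive_vs_nsclc < 1e-4)
```

Under these assumed table layouts, both comparisons fall below the P ≤ 0.0001 threshold quoted in the text.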
We sought to confirm our IHC analysis by deciphering the timing of the full repertoire of APOBEC3 gene-expression changes during cancer evolution in expression studies (see Supplementary Table S1) that assayed multiple morphologic stages in the progression from normal tissue to established lung adenocarcinoma (LUAD; ref. 33) or LUSC (34). Within the LUAD data set (33), we observed increased expression of A3B in AIS and LUAD relative to normal lung tissue, whereas A3A expression decreased in the invasive stage relative to normal lung tissue (Fig. 1D; linear mixed-effects model, FDR ≤ 0.1; LUAD, FDR ≤ 0.05). Within the LUSC data set (34), significant increases in A3A, A3B, and A3F expression were detected in the stages from moderate dysplasia to invasive carcinoma compared with normal tissue samples (Fig. 1E; linear mixed-effects model A3A, FDR ≤ 0.05; A3B, FDR ≤ 0.01; A3F, FDR ≤ 0.05). In situ hybridization (ISH) quantification of A3A and A3B transcripts using BaseScope in a small set of samples, including normal, dysplastic, CIS, and NSCLC tissue, confirmed an increase in A3A and A3B expression in CIS and NSCLC samples relative to normal lung tissue (Supplementary Fig. S1B and S1C). We then sought to investigate whether early APOBEC3 expression changes are specific to NSCLC. We reanalyzed multiple independent data sets of gene-expression changes during breast cancer progression (35–37). Similar to LUSC, data from breast cancer progression studies, including analyses from normal to either DCIS or invasive ductal carcinoma (IDC), demonstrate an increased expression of A3B at the DCIS and IDC stages in two of three data sets relative to normal breast tissue (Supplementary Fig. S1D–S1F; linear mixed-effects model, FDR ≤ 0.1; FDR ≤ 0.05; FDR ≤ 0.01). Taken together, these data suggest that APOBEC3 expression is dynamic and is upregulated during early lung and breast cancer development.
Next, we investigated the mutational signatures of APOBEC3-mediated mutagenesis in preinvasive and invasive NSCLC (Fig. 1F; Supplementary Fig. S1G). In two patients from the TRACERx study in whom rare synchronous preinvasive lesions and invasive tumors are clonally related (Fig. 1G), truncal mutations were found in the APOBEC3 context (Fig. 1H). A LATS1 driver event in an APOBEC3 context was ubiquitously found in the preinvasive and invasive lesions of patient CRUK0077. Patient CRUK0235 had a PTEN driver in an APOBEC3 context detectable in the preinvasive lesion and regions 1 and 4 of the invasive lesion, indicating early subclonal seeding of the primary tumor. However, we cannot exclude later subclonal diversification and seeding of the preinvasive lesion with a PTEN-mutant subclone. In addition, we observed subclonal APOBEC3 signature mutations that were unique (i.e., private) to the matched preinvasive or invasive lesions in both patients (Fig. 1H), indicating that APOBEC3-mediated polyclonal diversification can occur at the preinvasive stage prior to malignant transformation. Within the first 100 patients of the TRACERx cohort (1) we identified 487 driver mutations, of which 71 (15%) were in an APOBEC3 context. In total 37 of 100 tumors harbored at least 1 driver mutation in an APOBEC3 context, of which 26 tumors have clonal APOBEC3 driver events (Fig. 1I). These data strengthen our observations that APOBEC3 plays an important role in preinvasive lung cancer mutagenesis.
To further investigate which APOBEC3 family members might be affecting NSCLC progression, we studied the association between APOBEC3 gene expression and progression-free interval (PFI) in The Cancer Genome Atlas (TCGA) LUAD and LUSC data sets (Supplementary Fig. S1H–S1K). In patients with earlier-stage (I and IA) LUAD, we observed that higher A3B expression was associated with shorter PFI (Supplementary Fig. S1H; Wald test, A3B, P = 0.012), whereas this was not the case in patients with later-stage (IIIA, IIIB, and IV) LUAD (Supplementary Fig. S1I). Within patients with LUSC, higher A3A and A3F expression was associated with shorter PFI (Supplementary Fig. S1J; Wald test, A3A, P = 0.018; A3F, P = 0.012) but not in patients with later-stage (IIIA, IIIB, and IV) LUSC (Supplementary Fig. S1K). No other APOBEC3 genes were associated with significantly worse PFI (Supplementary Fig. S1H–S1K). Altogether, our data suggest that APOBEC3 plays an important role in early lung and breast cancer development and progression.
APOBEC3 Expression Increases during Either Replication Stress–Associated Senescence or Proliferation
Previously, we demonstrated that DNA replication stress can drive A3B expression (20). To explore the underlying basis of the high APOBEC3 expression observed in the preinvasive NSCLC (Fig. 1), we investigated potential mechanisms that induce APOBEC3 expression prior to the establishment of invasive cancer. Preinvasive lesions contain epithelial cells undergoing replication stress, and this could result in either senescence (38) or transformation and unchecked proliferation. Indeed, normal bronchial epithelium was negative for (phosphorylated) pRPA(S33), a common marker of replication stress, whereas >75% of epithelial cells in all the examined preinvasive lesions (N = 5 AAH, N = 13 AIS, N = 6 MIA, N = 15 CIS) were pRPA(S33)-positive (Fig. 2A). Furthermore, APOBEC3 staining of adjacent sections revealed that almost all APOBEC3-positive cells were pRPA(S33)-positive (>95%; Fig. 2B). We next sought to clarify a potential link between APOBEC3 and either senescence or proliferation.
We identified senescent cells using a biotin-linked Sudan Black-B analogue (SenTraGor), which stains for lipofuscin-containing senescent cells (ref. 39; N = 81 normal lung samples, N = 25 preinvasive lesions, N = 85 NSCLC; Fig. 2C and D). In addition, we stained for APOBEC3 protein expression on consecutive sections of FFPE clinical samples (N = 81 normal lung samples, N = 23 preinvasive lesions, N = 85 NSCLC; Fig. 2E). As expected, the SenTraGor staining was more prevalent in preinvasive lesions (>10% positivity in 18 of 25 cases) relative to normal lung epithelium (>10% positivity in 0 of 81 cases) and invasive carcinoma (>10% positivity in 10 of 85 cases; Fig. 2B and D; two-tailed Fisher exact test, P ≤ 0.0001). The APOBEC3-positive lesions were enriched with patches of overlapping SenTraGor-positive cells in 5 of 81 histologically normal regions (6%), 21 of 23 evaluable preinvasive lesions (91%), and 18 out of 85 carcinomas (21%) as evaluated by the adjacent sections (Fig. 2E; two-tailed Fisher exact test, P ≤ 0.01, P ≤ 0.0001).
We hypothesized that replication stress–associated senescence may underlie this observed APOBEC3 upregulation, so we next monitored APOBEC3 expression using a previously described epithelial cell line model of oncogene-induced senescence (40). In the human bronchial epithelial cell (HBEC) CDC6 Tet-ON cell line, induction of CDC6 overexpression through prolonged doxycycline treatment triggers replication stress and subsequent senescence (40). As expected upon doxycycline exposure, HBEC CDC6 Tet-ON cells flattened, contained more vacuoles, and exhibited increased senescence-associated β-gal (SA-β-gal) staining (Supplementary Fig. S2A; Fig. 2F and G). Quantitative reverse transcription PCR (qRT-PCR) of APOBEC3 genes in HBEC CDC6 Tet-ON cells revealed a more than 15-fold increase in several APOBEC3 transcripts, including A3A, A3B, A3C, A3G, and A3H mRNA expression, by day 6 (Fig. 2H). The CDC6-induced increase in A3B transcript and protein levels was abrogated after 24-hour CHK1 kinase inhibition (Supplementary Fig. S2B and S2C), indicating that the increase in gene expression is dependent on the activation of a CHK1-mediated replication stress checkpoint (41). Interestingly, A3B protein levels in senescence-escaped HBEC-CDC6 cells returned to baseline (Supplementary Fig. S2D and S2E), potentially indicating adaptation.
To further assess whether senescence drives or is merely coincident with APOBEC3 expression, we drew on a recent study showing that microRNA146a (miR146a) is strongly upregulated in senescent cells (42). We ectopically expressed miR146a-EGFP, a reporter of senescence (42), in H2122 and H1944 lung cancer cell lines to investigate the relationship between senescence and APOBEC3 expression. We treated these cell lines with a high dose of hydroxyurea (HU; 1 mmol/L, 1–4 days) to induce replication stress and prolonged fork arrest in order to investigate the dynamics of senescence and APOBEC3 gene expression. In addition, we used a high dose of irradiation (IR; 8 Gy IR, 4 days recovery) as a positive control in order to initiate senescence through an excess of acute DNA double-strand breaks rather than DNA replication stress. Both HU and IR induced senescence-associated cell morphologic changes and increased SA-β-gal staining (Supplementary Fig. S2F–S2I), increased the miR146a-EGFP signal (Supplementary Fig. S2J and S2K), and reduced EdU incorporation (Supplementary Fig. S2L and S2M). Interestingly, HU but not IR increased APOBEC3 expression, despite both conditions inducing senescence at day 4 (Supplementary Fig. S2N and S2O). These data suggest that the induction of APOBEC3 expression and senescence are parallel pathways driven by replication stress.
The percentage of APOBEC3-positive cells was greater than that of senescent cells in preinvasive NSCLC (compare Fig. 1A–C and Fig. 2B–D), implying that despite overlapping in many cases (Fig. 2E), most APOBEC3-positive cells were nonsenescent. To elucidate whether proliferation could instead be implicated in APOBEC3 expression, a triple immunofluorescence stain for APOBEC3, Ki-67 (indicative of proliferation), and the cyclin-dependent kinase inhibitor p21 (indicative of cell-cycle arrest) in 12 CIS samples with available tissue (from ref. 29) showed that on average 39% (range, 13%–75%) of APOBEC3-positive cells were also positive for Ki-67, whereas only 4.8% (range, 0.2%–23.6%) were positive for p21 (Fig. 2I and J). For 79 NSCLC samples with available tissue from the TRACERx 100 cohort (1), a double immunofluorescence stain for APOBEC3 and Ki-67 (p21 was not evaluable) was performed (Fig. 2K). Similar to the CIS samples, a large proportion of APOBEC3-positive cells in NSCLC were also Ki-67–positive (Fig. 2K; average 30% of A3B-positive cells were Ki-67–positive, range, 5%–70%). Concordant with our earlier observation (Fig. 1C), preinvasive lesions contained more APOBEC3-positive epithelial cells than NSCLCs (Supplementary Fig. S2P; N = 12 CIS, N = 79 NSCLC, two-tailed Mann–Whitney test, P ≤ 0.0001).
To clarify which APOBEC3 gene family member best correlates with nuclear APOBEC3 immunofluorescence, we explored matched RNA sequencing (RNA-seq)–derived APOBEC3 expression within the TRACERx 100 cohort (1). We observed a stronger correlation with A3B (Supplementary Fig. S2Q; Spearman correlation, N = 54, ρ = 0.41; P = 0.0023) relative to A3A mRNA expression (Supplementary Fig. S2R; Spearman correlation, N = 54, ρ = 0.16; P = 0.24). There was no significant correlation between the percentage of APOBEC3-positive tumor cells and the subclonal, clonal, or total number of APOBEC3 signature mutations (SBS2/SBS13) in either the CIS (from ref. 29) or NSCLC samples (from ref. 1; Supplementary Fig. S2S; Spearman correlation, N = 12 CIS and N = 79 NSCLC, ρ ≤ 0.17, P ≥ 0.14).
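As an illustration of the rank-based statistic used in these comparisons, the sketch below computes Spearman's ρ as the Pearson correlation of average ranks. The function names are hypothetical and the input values are invented toy numbers, not study data; the study's own analysis software is not specified in this section.

```python
# Minimal sketch of Spearman's rank correlation (toy data, hypothetical names).


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


# Invented toy values: percentage of positive cells vs. an expression score.
toy_percent_positive = [5, 12, 20, 33, 41, 60]
toy_expression = [0.8, 1.1, 0.9, 2.4, 2.0, 3.5]
rho = spearman_rho(toy_percent_positive, toy_expression)
```

Because the statistic depends only on rank order, any monotonic transformation of either variable, such as log-scaling expression values, leaves ρ unchanged.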
Next, to test whether cell proliferation was required for APOBEC3 expression in the absence of DNA damage, we densely seeded RPE-1 cells to facilitate contact inhibition. Contact inhibition in RPE-1 cells strongly reduced A3B mRNA expression, whereas subconfluent reseeding and cell-cycle progression elicited a rebound of A3B mRNA expression toward baseline (Supplementary Fig. S2T and S2U). Taken together, these data suggest that replication stress and cell-cycle progression drive A3B expression. In contrast, the senescence program does not induce A3B expression and is a parallel pathway downstream of replication stress. Thus, our data suggest that APOBEC3 is upregulated early during breast cancer and NSCLC development with DNA replication stress being a likely driver of APOBEC3 expression (Figs. 1 and 2).
A3B Exacerbates DNA Replication Stress
We next reasoned that, because APOBEC3 deaminates cytosines on ssDNA, it could itself induce replication stress as previously described (43–46), pointing toward a possible feed-forward loop driving CIN. Because A3A and A3B are the prime candidate APOBEC3 genes implicated in cancer (2, 17–19, 47), we generated A3A and A3B single-gene knockouts (KO) in the immortalized human type II pneumocyte cell line stably expressing a 4-hydroxytamoxifen (4-OHT)-regulatable oncogenic RAS chimeric protein (ref. 48; hereafter referred to as TIIP; Supplementary Fig. S3A–S3F). We investigated whether A3A and A3B could contribute to replication stress and ensuing accumulation of underreplicated DNA. Consistent with this hypothesis, in unperturbed proliferating cells, TIIP A3B-KO cells showed a strong reduction of pRPA(S33) and pCHK1(S345) compared with TIIP wild-type (WT) cells, confirming previous observations (43–46), whereas this reduction was not observed in TIIP A3A-KO cells (Supplementary Fig. S3E). Furthermore, TIIP A3B-KO cells showed higher fork extension rates, suggesting that A3B might promote fork slowing, exacerbating replication stress (Fig. 3A and B; two-tailed Mann–Whitney test, P ≤ 0.0001). Concordant with these data, we detected an increased fork extension rate also in the LUSC cell line H520, using siRNA-mediated knockdown of A3B (Fig. 3C; two-tailed Mann–Whitney test, P ≤ 0.01; Supplementary Fig. S3G) and a reduction in fork extension rate upon A3B overexpression in HEK293 cells (Fig. 3D; two-tailed Mann–Whitney test, P ≤ 0.0001; Supplementary Fig. S3H and S3I). In addition, we found that in the presence of mild replication stress (0.2 μmol/L aphidicolin, 24 hours), TIIP A3B-KO cells accumulated fewer FANCD2 foci in prometaphase than TIIP WT or TIIP A3A-KO cells (Fig. 3E and F; two-tailed Mann–Whitney test, P ≤ 0.001; ref. 49). In contrast, A3B overexpression increased the number of FANCD2 foci in prometaphase (Supplementary Fig. S3J; two-tailed Mann–Whitney test, P ≤ 0.0001).
Consistent with the attenuation of DNA replication stress following A3B deletion, after treatment with 0.2 μmol/L aphidicolin, TIIP A3B-KO cells presented with fewer metaphase breaks at the FHIT common fragile site locus (Fig. 3G and H, two-tailed Fisher exact test, P ≤ 0.01), fewer FANCD2-flanked ultrafine bridges (UFB; Fig. 3I and J; two-tailed Mann–Whitney test, P ≤ 0.05) and fewer 53BP1 nuclear bodies in the G1 cell-cycle phase relative to TIIP WT cells (Fig. 3K and L; Supplementary Fig. S3K; two-tailed Mann–Whitney test, P ≤ 0.0001). After 24 hours of low-dose (0.2 μmol/L) aphidicolin treatment, the percentage of TIIP A3B-KO cells in the G1 cell-cycle phase was greater than that of TIIP WT cells (Supplementary Fig. S3L; two-tailed unpaired t test, P ≤ 0.05), suggesting cells might cope better without A3B-induced genome instability, allowing more cells to complete the cell cycle. Similarly, confirming the findings in TIIP cells, we observed that U2OS A3B-KO cells (Supplementary Fig. S3M–S3O) also had fewer 53BP1 nuclear bodies in the G1 cell-cycle phase relative to U2OS WT cells after 24-hour low-dose (0.2 μmol/L) aphidicolin treatment (Supplementary Fig. S3P; two-tailed Mann–Whitney test, P ≤ 0.0001). Because TIIP cells were immortalized and therefore do not properly senesce (see Methods), we also used the U2OS A3B-KO cells to investigate whether A3B-KO cells were still capable of senescing. Many U2OS A3B-KO cells harbored SA-β-gal–positive signals after 4 days HU treatment (0.2 mmol/L and 1 mmol/L), suggesting senescence can occur in the absence of A3B (Supplementary Fig. S3Q and S3R; two-tailed unpaired t test, P ≤ 0.05).
Analogous to aphidicolin treatment, 4-OHT–mediated RAS induction (3 days) resulted in fewer DNA damage–related foci in TIIP A3B-KO relative to TIIP WT cells (Supplementary Fig. S3S and S3T, two-tailed Mann–Whitney test). Taken together, these results suggest A3B exacerbates DNA replication stress, likely contributing to CIN.
A3B Exacerbates CIN and Promotes Aneuploidy
Because underreplicated regions of DNA contribute to CIN (49), we investigated whether A3B might contribute to CIN. We used the ImageStream cytometer together with centromere-specific FISH probes (hereafter ImageStream FISH) to quantify aneuploidy frequencies after 24 hours of low-dose (0.2 μmol/L) aphidicolin exposure. A centromeric chromosome 15 probe was used, overcoming the bias of the elevated missegregation observed with some larger chromosomes (50). A significant decrease in cells deviating from the modal chromosome 15 signal was detected in TIIP A3B-KO cells compared with TIIP WT cells (Fig. 4A; two-tailed unpaired t test, P ≤ 0.05). Furthermore, after 24 hours of low-dose (0.2 μmol/L) aphidicolin exposure, A3B-depleted cells had fewer micronuclei (Fig. 4B; two-tailed unpaired t test, P ≤ 0.05), both in RPE-1 cells through siRNA-mediated knockdown of A3B (Fig. 4C, two-tailed unpaired t test, P ≤ 0.05; Supplementary Fig. S4A) and in TIIP A3B-KO cells in the presence of 4-OHT (3-day RAS; Supplementary Fig. S4B, two-tailed Mann–Whitney test, P ≤ 0.0001). In contrast, A3B overexpression increased the percentage of cells with chromosome missegregation and micronuclei (Fig. 4D and E, two-tailed unpaired t test, P ≤ 0.05).
To test whether A3B overexpression could contribute to CIN in vivo, we combined a Cre-inducible model for human A3B expression (Rosa26::LSL-A3B/LSL-A3B) with an EGFRL858R;Trp53flox/flox–driven lung cancer mouse model and a Cre-inducible tetracycline-controlled transactivator (R26LNL-tTA; see Methods; refs. 51, 52). Tumors were induced in the lungs of EGFRL858R;Trp53flox/flox (EP; N = 7) or EGFRL858R;Trp53flox/flox;R26LSL-A3B/LSL-tTA (EP-A3B; N = 8, 2 combined experiments; Fig. 4F). Lungs were harvested either at 3 months in 1 experiment or at termination in an independent experiment (between 110 and 207 days). As expected, lung cancers from EP-A3B mice stained positive for A3B in contrast to those from EP mice (Fig. 4G). There was significantly more pRPA(S4/S8) staining and foci (indicative of DNA damage) in lung cancers from EP-A3B mice relative to EP mice (Fig. 4G and I, two-tailed unpaired t test, P ≤ 0.01; P ≤ 0.05). We examined hematoxylin and eosin (H&E)–stained sections obtained from the mouse lung cancers for chromosome missegregation events (Fig. 4J). The EP lung cancer cells missegregated in 36%, whereas the EP-A3B lung cancer cells missegregated in 55% (Fig. 4K; Mann–Whitney test, P ≤ 0.001). These data suggest that A3B can contribute to CIN in developing lung cancers in vivo.
To assess the clinical relevance of our findings, we examined published gene-expression data sets of preinvasive lung and breast cancer in addition to the TRACERx lung study involving the first 100 patients (1). The enrichment of the CIN70 gene signature, a surrogate measure of CIN (53), increased during preinvasive LUSC evolution (Fig. 4L; two-tailed Mann–Whitney test, FDR ≤ 0.05). Unlike other APOBEC3 family members, A3B expression correlated significantly with the CIN70 enrichment score in almost all examined data sets (Fig. 4M; Spearman correlation, P ≤ 0.05, P ≤ 0.01, P ≤ 0.001). A3A, A3C, and A3F expression significantly positively correlated with the CIN70 gene signature in only a few data sets (Fig. 4M; Spearman correlation, P ≤ 0.05, P ≤ 0.01, P ≤ 0.001).
Additionally, we examined diagnostic H&E samples from the TRACERx 100 cohort (1) with microscopy for chromosome missegregation events (Supplementary Fig. S4C and S4D) and investigated potential associations with patient-matched APOBEC3 expression profiles. Only H&E sections with ≥10 anaphases were considered (9 diagnostic H&E sections had <10 anaphases and were not considered). Unlike any other APOBEC3 family members, only A3B expression significantly, albeit moderately, correlated with the percentage of anaphases with chromosome missegregation (Supplementary Fig. S4D; Spearman correlation, N = 58, ρ = 0.27; P = 0.038). Within the same cohort of patients, again only A3B expression was significantly associated with the proportion of the genome affected by somatic copy-number alterations (SCNA; Supplementary Fig. S4E; Spearman correlation, N = 58, ρ = 0.38; P = 0.0030). However, the percentage of APOBEC3-positive tumor cells did not correlate with the percentage of anaphases with chromosome missegregations (Supplementary Fig. S4F; Spearman correlation, N = 70, ρ = 0.047; P = 0.70) nor the proportion of the genome affected by SCNAs (both clonal and subclonal; Supplementary Fig. S4G; Spearman correlation, N = 72 (for 7 patients we could not derive SCNA measures), ρ = 0.11; P = 0.35). This might be explained by the antibody detecting both nuclear A3A and A3B, whereas RNA-seq enables reliable separation of A3A and A3B transcripts.
Finally, we examined the relationship between the burden of APOBEC3-mediated mutagenesis and the proportion of the genome affected by SCNAs at a regional level within the TRACERx 100 cohort (1). Only tumors with a significant difference in APOBEC3 mutations between the 2 tumor regions were considered in this analysis, with each comparison being confined to within-tumor regions (N = 14). We found that regions with higher numbers of APOBEC3 signature mutations had a significantly higher proportion of their genomes affected by SCNAs relative to tumor regions from the same patient with lower numbers of APOBEC3 signature mutations (Fig. 4N; two-tailed paired Wilcoxon test, N = 14, P = 0.042). Taken together, these data support the role of A3B induction in preinvasive disease, exacerbating replication stress and driving CIN (Fig. 4O).
Discussion
APOBEC3 expression is upregulated in a wide range of cancers including breast cancer and NSCLC (4, 5). A3A and A3B are two prominent candidates thought to be responsible for the APOBEC3-mediated mutational signature in numerous cancer types (2, 4, 5). Although recent next-generation sequencing efforts have detected APOBEC3 signature mutations within preinvasive LUAD (25–27), these studies have not analyzed the basis for APOBEC3 expression dynamics and the specific contribution of APOBEC3 to genome instability from preinvasive to invasive disease. Recent studies in breast cancer identified that A3B mRNA is upregulated at the DCIS stage (24, 54), but these studies did not investigate the entire repertoire of APOBEC3 genes. Here, we took an unbiased approach to investigate mRNA expression changes of all APOBEC3 genes during breast cancer and NSCLC evolution. Our IHC data confirmed overexpression of APOBEC3 in preinvasive lesions relative to normal lung epithelium and strongly suggest that APOBEC3 protein expression peaks during the preinvasive stage of NSCLC development. A part of the heterogeneity in APOBEC3 protein expression found in cancer could potentially be explained by SCNAs driving downstream changes in protein abundance (9) or associations with the cell cycle (55). The APOBEC3 nuclear staining likely originates from both A3B and A3A proteins; in contrast, the cytoplasmic signal has been reported to derive from A3A and A3G (30, 32). Furthermore, through sequencing of synchronous preinvasive and invasive lesions in two TRACERx patients, we observed both ubiquitous and private APOBEC3 signature mutations between the matched lesions. Additionally, both patients had a driver event in an APOBEC3 context that was present in the preinvasive and invasive lesions. 
Given our findings that APOBEC3 expression occurs early in preinvasive breast and NSCLC evolution and the role of APOBEC3 in catalyzing CIN, combined with our recent findings that tumor suppressor gene (TSG) losses are commonly early truncal events in tumor evolution (56), we suggest that APOBEC3 induction following DNA replication stress may drive the early onset of CIN, fueling clonal TSG copy-number loss events. In either scenario, our data suggest that APOBEC3 plays a role in driving genome instability and diversification in preinvasive disease, contributing to cancer evolution.
Previously, we have demonstrated that replication stress can drive APOBEC3 expression in breast cancer (20). We now demonstrate that APOBEC3-positive preinvasive lesions are enriched for a marker of senescence (SenTraGor) as well as a marker of replication stress [pRPA(S33)]. Our in vitro experiments suggest that replication stress drives APOBEC3 expression and senescence independently. Interestingly, U2OS A3B-KO cells had more SA-β-gal staining after prolonged replication fork arrest, suggesting A3B may support senescence bypass. At the very least, A3B does not appear to be required for senescence induction.
Consistent with a functional role for replication stress in driving A3B expression, CHK1 kinase inhibition reversed CDC6-induced A3B protein levels. Note that the profile of the induced APOBEC3 genes and the dynamics of APOBEC3 induction can differ between experimental conditions. Despite APOBEC3-positive cells having patches of SenTraGor staining, the majority of these cells were in fact SenTraGor-negative. A triple immunofluorescence staining for APOBEC3, Ki-67, and p21 in lung CIS samples showed that APOBEC3-positive cells were more often Ki-67–positive than p21-positive. A strong association between Ki-67 and A3B was recently also reported in HPV-negative oral epithelial dysplasias and head and neck cancers (57). Interestingly, here we found that APOBEC3 protein levels did not correlate with the number of APOBEC3 signature mutations. These data indicate that the sum of A3A and A3B protein levels may not fully reflect the history of APOBEC3-mediated mutagenesis. This is in line with observations that APOBEC3-mediated mutagenesis can occur in bursts (58). In summary, our IHC data together with in vitro experiments point toward replication stress and cell-cycle progression driving A3B expression. In contrast, the senescence program does not induce A3B expression and is a separate pathway downstream of replication stress (Fig. 4O).
Our data support a model whereby A3B promotes the accumulation of underreplicated DNA through replication stress. A3B, and likely other APOBEC3 genes including A3A (59), appears to exacerbate replication stress when replication stress is already mildly elevated, hinting toward a positive feed-forward loop (Fig. 4O). Similar to our observations of A3B inducing replication stress, Mehta and colleagues have described that A3A can induce replication fork slowing. This feed-forward loop might be initiated by endogenous factors, such as oncogene-induced replication stress (60), or by exogenous factors, such as tobacco smoking (61). These observations are particularly intriguing in light of a recent study in yeast showing that A3B causes 9- to 28-fold more mutations in engineered yeast strains with low levels of replicative DNA polymerases (62).
In conclusion, our in vitro and in vivo experimental data together with clinical evidence suggest that A3B induction early in tumorigenesis exacerbates both mutational and large-scale chromosomal copy-number diversity, providing a potent evolutionary fuel for selection and cancer adaptation.
Methods
Ethical Approval
Approval for study (1) was provided by the London Camden and Kings Cross Research Ethics Committee (13/LO/1546). Approval for study (29) was provided by the UCL/UCLH Local Ethics Committee (06/Q0505/12 and 01/0148). Approval for using tissue specimens from a Danish cohort of NSCLC was provided by the local ethical committee of Rigshospitalet, Copenhagen University Hospital and by the Danish Capital Region's Committee for Health Research Ethics (H-15008619).
All patients provided written informed consent, and the studies were performed in compliance with all relevant ethical regulations. These studies were conducted in accordance with recognized ethical guidelines (Declaration of Helsinki).
Hybrid Histochemistry/IHC
Sections were stained from previously initiated cohorts of patients with NSCLC, namely from TRACERx (1), patients with preinvasive LUSC lesions undergoing surveillance (29), and a Danish cohort of NSCLC.
For APOBEC3 and pRPA(S33) staining of deparaffinized sections, we used our established sensitive IHC staining protocol. Antigens were unmasked in citrate buffer (pH 6) using a microwave (15 minutes), followed by overnight incubation with the primary rabbit monoclonal anti-APOBEC3 antibody (5210-87-13; ref. 30) and the primary rabbit polyclonal anti-RPA2(S33) antibody (NB100-544; Novus Biologicals). Sections were then processed by the indirect streptavidin–biotin–peroxidase method using the Vectastain Elite Kit (Vector Laboratories) with nickel sulfate–based chromogen enhancement detection as previously described, without nuclear counterstaining (63). The primary rabbit polyclonal anti-RPA2(S33) antibody was not reactive against murine tissue; therefore, the rabbit anti-RPA2 S4/S8 antibody (NBP123017; Novus Biologicals) was used instead.
For senescence detection by SenTraGor, a GL13-based (a biotinylated Sudan Black-B chemical analogue that specifically reacts against lipofuscin) hybrid histochemical/IHC method was used, performed as previously published (39). Results were evaluated independently by two experienced oncopathologists, and scored on at least 400 cells per section, based on the percentage of cells with nuclear positivity of APOBEC3 or cytoplasmic positivity by the SenTraGor method, respectively. A similar quantitative method has been used and described in previous studies (20, 63). Representative images are shown in Figs. 1 (APOBEC3) and 2 [senescence, pRPA2(S33), APOBEC3], and scores were categorized as indicated in their respective legends.
The pRPA2(S4/S8) IHC images from the mouse lung cancers were analyzed using ImageJ. The signal was assessed on representative images derived from six tumors in which the engineered human A3B was not induced and six tumors expressing A3B. Negative control staining (sections from tumors without induced A3B, stained with the anti-APOBEC3 antibody; see top left in Fig. 4G for an example) served as the background. The IHC signals were inverted, and the threshold was adjusted equally for each image (64). Both the total signal intensity and the number of foci (particle size above 20 pixels) were measured.
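The thresholding and particle-counting step described above can be illustrated with a minimal Python sketch. This is not the ImageJ workflow itself: the function name `count_foci`, the 4-connected neighborhood, and the representation of an already inverted image as a nested list are assumptions made for illustration. Only the 20-pixel size cutoff is taken from the text.

```python
from collections import deque

def count_foci(image, threshold, min_size=20):
    """Return (total_intensity, n_foci) for a 2-D list of pixel intensities.

    Assumes the image is already inverted so that higher values mean more
    signal. Pixels above `threshold` count as signal; a focus is a
    4-connected group of signal pixels containing more than `min_size` pixels.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    total_intensity = 0
    n_foci = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected particle.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                total_intensity += image[y][x]  # all signal pixels contribute
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size > min_size:  # particle size above 20 pixels
                n_foci += 1
    return total_intensity, n_foci
```

Applying the same `threshold` to every image mirrors the equal threshold adjustment described in the text; the total intensity here sums every above-threshold pixel, whereas only particles above the size cutoff are counted as foci.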
BaseScope
BaseScope was performed according to the manufacturer's instructions (Bio-Techne). BaseScope probe Hs-APOBEC3A-2zz-st1 (701261) was used for detecting A3A transcripts, and probe Hs-APOBEC3B-1zz-st1-C2 (701271-C2) was used for detecting A3B transcripts.
Whole-Exome Sequencing of Patient-Matched Synchronous Preinvasive Lesions and Invasive NSCLC
Two patients enrolled in the TRACERx study presented with driver mutations in the APOBEC3 context that were present in both the preinvasive and invasive NSCLC lesions (1). Preinvasive and invasive lesions were considered to be clonally related if they shared ≥20 mutations. Preinvasive lesions were identified from archival FFPE blocks. These two TRACERx cases were each diagnosed with synchronous CIS and LUSC. Laser capture microdissection was performed to sample preinvasive lesions, followed by DNA extraction using the GeneRead DNA FFPE Kit (Qiagen). DNA quality control and whole-exome sequencing (WES) followed a strategy similar to that previously described (1). SBS mutational signatures were called in the preinvasive and invasive samples as described previously (1). To account for FFPE-induced artifacts, we removed C>T SBSs at CpG sites that had a variant count of less than 10.
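The two numerical rules in this section (removal of low-count C>T substitutions at CpG sites, and the ≥20 shared-mutation criterion for clonal relatedness) can be sketched in Python as follows. The record fields (`ref`, `alt`, `context`, `alt_count`), the function names, and the handling of the opposite strand (G>A preceded by C) are illustrative assumptions and are not taken from the pipeline used in the study.

```python
def is_ct_at_cpg(ref, alt, context):
    """True for a C>T change at a CpG site, on either strand.

    `context` is the reference trinucleotide centered on the mutated base.
    Treating G>A after a C as the same change on the opposite strand is an
    assumption of this sketch.
    """
    if ref == "C" and alt == "T":
        return context[2] == "G"
    if ref == "G" and alt == "A":
        return context[0] == "C"
    return False

def filter_ffpe_artifacts(variants, min_count=10):
    """Drop C>T SBSs at CpG sites supported by fewer than `min_count` reads."""
    return [
        v for v in variants
        if not (is_ct_at_cpg(v["ref"], v["alt"], v["context"])
                and v["alt_count"] < min_count)
    ]

def clonally_related(mutations_a, mutations_b, min_shared=20):
    """Lesions are called clonally related if they share >= min_shared mutations."""
    return len(set(mutations_a) & set(mutations_b)) >= min_shared
```

Mutations passed to `clonally_related` can be any hashable identifiers, for example (chromosome, position, ref, alt) tuples.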
TRACERx RNA-seq
The RNA-seq data have been described elsewhere (65). Median values from the regional RNA-seq data were used per patient.
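As a brief illustration of the per-patient summary, the sketch below collapses regional expression values to one median per patient. The tuple layout and the identifiers are hypothetical; the study's actual data structures are not described here.

```python
from collections import defaultdict
from statistics import median

def per_patient_median(regional_values):
    """Collapse regional expression values to one median per patient.

    `regional_values` is an iterable of (patient_id, region_id, expression)
    tuples for a single gene.
    """
    by_patient = defaultdict(list)
    for patient_id, _region_id, expression in regional_values:
        by_patient[patient_id].append(expression)
    return {patient: median(values) for patient, values in by_patient.items()}
```

For a patient with an even number of regions, `statistics.median` returns the mean of the two middle values.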
Cell Culture
The H2122, H1944, H520 lung cancer cell lines, and the U2OS and RPE-1 cell lines were obtained from Cell Services at The Francis Crick Institute, UK. H2122, H1944, and H520 cells were cultured in RPMI-1640 media (Thermo Fisher Scientific), supplemented with 10% fetal bovine serum and 1/10,000 units of penicillin–streptomycin (Sigma-Aldrich) and with L-glutamine (Thermo Fisher Scientific). RPE-1 and U2OS cells were cultured in DMEM (Thermo Fisher Scientific), supplemented with 10% fetal bovine serum and 1/10,000 units of penicillin–streptomycin (Sigma-Aldrich) and L-glutamine (Thermo Fisher Scientific). TIIP-ER-KRAS V12 (referred to as TIIP) cells were kindly provided by J. Downward (The Francis Crick Institute, London, UK; ref. 48). The process of TIIP cell derivation and immortalization has been described elsewhere (66). TIIP cells were cultured in DMEM/F-12 without phenol red (Thermo Fisher Scientific), supplemented with charcoal-stripped 10% fetal bovine serum and 1/10,000 units of penicillin–streptomycin (Sigma-Aldrich) and with L-glutamine (Thermo Fisher Scientific). The generation of HBEC CDC6 Tet-ON cells is described elsewhere (40). These cells were maintained in keratinocyte serum-free medium (#17005042, Thermo Fisher Scientific) supplemented with 25 mg bovine pituitary extract and 2.5 μg EGF. All the cell lines have been validated by short tandem repeat profiling and regularly tested for the presence of Mycoplasma. All cell culture experiments were performed within 15 passages from thawing.
Generation of Knockout Cells
A3A KO and A3B KO cells were generated using a similar strategy as described elsewhere (ref. 30; Supplementary Fig. S3A). The TIIP and U2OS bulk cell lines were single cell cloned to create a parental (i.e., WT) cell line (Supplementary Fig. S3B and S3M). The parental clone was transiently transfected with PX458(193–194) and PX458(199–200) and single cell cloned to create the TIIP A3A#4 KO cell line, or the parental clone was transiently transfected with PX458(197–198) and PX458(199–200) and single cell cloned to create the U2OS A3B#38, TIIP A3B#8, and TIIP A3B#13 KO cell lines. Confirmation of KO was performed through PCR and Western blotting (Supplementary Fig. S3C–S3E and S3N and S3O).
Treatments
For oncogene induction in the TIIP cell line, cells were incubated with 500 nmol/L 4-hydroxytamoxifen (4-OHT; Sigma-Aldrich), and this was replenished every day. For CDC6 induction in the HBEC-CDC6 cell line, cells were incubated in 1 μg/mL doxycycline, and this was replenished every second day. Senescence was induced with 8 Gy ionizing radiation or 1 mmol/L hydroxyurea. Low levels of replication stress were induced by exposing cells to 0.2 μmol/L aphidicolin (Sigma-Aldrich) for 24 hours, after which they were processed accordingly. CHK1 inhibition in HBEC-CDC6 cells was performed using 2 μmol/L LY2603618 (Selleckchem) for 24 hours prior to harvesting.
RNA Interference
Reverse transfections were performed with siRNA (Dharmacon, GE Healthcare) at a final concentration of 40 nmol/L using lipofectamine RNAiMax (Thermo Fisher Scientific). Nontargeting (NT) control siRNA was used as control. For specific APOBEC3B knockdown the following was used: ON-TARGETplus APOBEC3B siRNA targeted region (3′UTR, J-017322-08-0005).
Gel-Based Deamination Assay
The gel-based deamination assay was performed as described elsewhere (20). We used the following probe: 5′-(6-FAM)-AAAAAAAAATCGGGAAAAAAA-3′.
Immunofluorescence
Immunofluorescence was performed as described elsewhere (67). Primary antibodies are described in Supplementary Table S2. Secondary antibodies conjugated to Alexa Fluor 488 and Alexa Fluor 555 (1:2,000; Thermo Fisher Scientific) were used for detection. DNA was stained with DAPI (Thermo Fisher Scientific). Images of DNA fibers were acquired using a Zeiss AXIO Imager M2 microscope with a 40× 1.3 oil immersion objective (Zeiss) equipped with a Hamamatsu photonics camera. Images of anaphase cells and ultrafine bridges were acquired using an Olympus DeltaVision RT microscope (Applied Precision, LLC) with a 60× 1.3 oil immersion objective (Olympus) equipped with a Coolsnap HQ camera. Z stacks were acquired at 0.2-μm intervals through 12-μm-thick sections. Images of interphase cells were acquired using a 40× 1.4 oil immersion objective (Olympus). Z stacks of interphase cells were acquired at 0.8-μm intervals through 12-μm-thick sections. Deconvolution (iterative restoration with 8 iterations) was performed using the softWoRx deconvolution tool.
Tissue sections were stained from 12 CIS lesions derived from patients with preinvasive LUSC lesions undergoing surveillance (29). The APOBEC3 and Ki-67 double immunofluorescence was performed on a diagnostic tissue microarray prepared from 79 NSCLC samples from the TRACERx lung study involving the first 100 patients (1). The immunofluorescence images were analyzed using QuPath software (68). QuPath was used to distinguish epithelial cells from stromal cells, and manually reviewed thresholds were used to classify epithelial cells as positive or negative for the respective antibody staining.
Preparation of Metaphase Spreads and FISH
Cells were incubated with colcemid (Thermo Fisher Scientific) at 0.1 μg/mL for 3 hours, trypsinized, washed, and spun down. Cells were incubated at 37°C for 7 minutes in 2 mL prewarmed hypotonic solution (1:1, 0.4% KCl + 0.4% sodium citrate). Thereafter, 2 mL fixative was added (3:1, MeOH + acetic acid), and cells were spun. This process was repeated 3 times, after which cells were dropped. Samples were denatured, ethanol dehydrated, and subsequently incubated overnight at 37°C in a humidified chamber with FISH probe against the FHIT locus (FHIT-20-RE, Pisces Scientific). The next day, slides were plunged in wash buffer at room temperature and again at 65°C for 10 minutes, ethanol dehydrated, counterstained with DAPI, and finally mounted.
Mouse Strains and Tumor Induction
The Cre-inducible Rosa26::LSL-APOBEC3Bi mice are described elsewhere (refs. 51, 52; manuscripts under consideration). Mouse lung tumors were initiated in EGFRL858R;Trp53flox/flox (EP; N = 7) and EGFRL858R;Trp53flox/flox;R26LSL-A3B/LSL-tTA (EP-A3B; N = 8, 2 combined experiments) mice by intratracheal infection with adenoviral vectors expressing Cre recombinase (2.5 × 10⁷ adenoviral particles per mouse). Adenoviral-Cre (Ad-Cre-GFP) was obtained from the University of Iowa Gene Transfer Core. All animal regulated procedures were approved by the Francis Crick Institute BRF Strategic Oversight Committee that incorporates the Animal Welfare and Ethical Review Body and conformed with the UK Home Office guidelines and regulations under the Animals (Scientific Procedures) Act 1986 including Amendment Regulations 2012.
Evaluation of Chromosome Missegregation Errors in H&E Samples
Diagnostic H&E sections from NSCLC samples in the TRACERx 100 cohort (1) were evaluated for anaphases with chromosome missegregation events using a light microscope with a 100× objective. Only H&E sections with ≥10 anaphases were considered.
H&E sections from lung cancer samples from the EP (N = 7) and EP-A3B (N = 8) mouse models were evaluated for anaphases with chromosome missegregation events using a light microscope with a 100× objective. Depending on availability, 1–2 sections per mouse were evaluated. For each mouse, we observed at least 17 anaphases.
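A simple way to summarize such counts is the fraction of anaphases showing a missegregation event, reported only for sections that reach the minimum number of anaphases. The sketch below assumes this summary statistic and uses a hypothetical function name; the text specifies only the ≥10 anaphase inclusion criterion for the clinical sections.

```python
def missegregation_rate(n_anaphases, n_missegregating, min_anaphases=10):
    """Fraction of anaphases with a chromosome missegregation event.

    Returns None when fewer than `min_anaphases` anaphases were observed,
    mirroring the exclusion of sections with too few anaphases.
    """
    if n_missegregating > n_anaphases:
        raise ValueError("missegregating anaphases cannot exceed total anaphases")
    if n_anaphases < min_anaphases:
        return None
    return n_missegregating / n_anaphases
```

For example, a section with 20 anaphases of which 5 show errors yields a rate of 0.25, whereas a section with only 9 anaphases is excluded.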
Immunoblot Assays
Immunoblotting was carried out according to standard procedures. Primary antibodies are described in Supplementary Table S3.
RNA Extraction and Quantitative Reverse Transcription PCR
RNA extraction was performed using a Qiagen RNeasy Kit and reverse transcription using an AffinityScript cDNA Synthesis Kit (Agilent Technologies) according to the manufacturers' instructions. Quantitative reverse transcription PCR was performed on a QuantStudio 7 Flex Real-Time PCR System (Thermo Fisher Scientific). Previously described custom primers were used (ref. 69; Sigma-Aldrich) and are shown in Supplementary Table S4.
Flow Cytometry-Cell-Cycle Analysis
Cells were fixed in ice-cold 70% ethanol overnight at −20°C. Afterward, cells were washed with PBS and labeled for 30 minutes (RT) with 1 μg/mL of DAPI (Thermo Fisher Scientific) in PBS containing 50 μg/mL RNase A (Sigma-Aldrich). Cells were analyzed on a BD LSRFortessa X-20 (BD Biosciences) and acquired data were analyzed using the Cell-Cycle platform of FlowJo software.
DNA Fiber Stretching Assay
The DNA fiber stretching assay was performed as described previously (67). TIIP, H520, and HEK293-A3B cells were labeled with sequential 20-minute pulses of CldU (red) and IdU (green) and then subjected to DNA fiber stretching analysis.
EdU Incorporation Assay
The Click-iT Plus EdU Alexa Fluor 647 Flow Cytometry Assay Kit (Thermo Fisher Scientific) was used. Cells were supplemented with 10 μmol/L EdU for 30 minutes after which they were processed according to the manufacturer's instructions.
ImageStream FISH
Cells were exposed for 24 hours to vehicle or 0.2 μmol/L aphidicolin (Sigma-Aldrich), after which they were processed as described elsewhere (50). Briefly, 1.5 million cells were hybridized with chromosome 15 satellite enumeration probe (LPE015G, Cytocell) prior to analysis on an ImageStream X Mk II (Amnis).
Senescence-Associated β-Galactosidase Assay (Chromogenic Assay)
Senescence within cell culture was detected using the SA-β-gal assay (Sigma-Aldrich) as described elsewhere (70).
Recombinant DNA
Senescence reporter cell lines were created using the miR146a-EGFP plasmid (kind gift from S. Elledge, Harvard University; ref. 42). The pSpCas9(BB)-2A-GFP (PX458, Addgene plasmid #48138) plasmid was used for the generation of KO cell lines (kind gift from F. Zhang, MIT).
High-Throughput Imaging of DNA-Repair Foci and Micronuclei
Cells were seeded, treated, fixed, and stained in a black, clear-bottom 96-well plate (Greiner μClear 655090). The antibody-stained cells were imaged on a PerkinElmer Opera Phenix using a water immersion 63× lens to capture confocal stacks of 7 planes. The images were projected and analyzed using the associated Phenix software, Harmony. For these high-throughput imaging experiments, all values above and below the median of the WT vehicle condition are shown as red and blue dots, respectively.
Analysis of Gene Expression in Cancer Progression Data Sets
Cross-platform gene-expression profiles from different stages of cancer development were compiled from Gene Expression Omnibus for breast tissue (GSE16873, N = 40 samples from 12 patients; GSE21422, N = 19 samples; GSE47462, N = 72 samples from 25 patients) and lung tissue (GSE52248, N = 18; Mascaux and colleagues, ref. 34/GSE33479, N = 122 samples from 77 patients), and from Chen and colleagues (25). Differential gene expression in developmental stages relative to normal/healthy tissue was determined for each gene in the expression data set using a linear mixed-effects model with patient as a random effect. Fixed effects were applied only for Mascaux and colleagues (GSE33479), as reported previously (34). P values were adjusted for multiple testing across all genes in each expression profile (FDR).
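The text states only that P values were FDR-adjusted. As an illustration, the sketch below implements the Benjamini–Hochberg procedure, which is a common choice but is an assumption here; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In the setting described above, the adjustment would be applied once per expression profile, across the P values of all genes in that profile.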
CIN (CIN70) Gene Signature Enrichment
We evaluated an enrichment score for a previously published set of genes associated with CIN, CIN70 (53). Single-sample gene set enrichment analysis (ssGSEA; ref. 71) was applied to the gene-expression profile of each sample individually to calculate a normalized enrichment score (NES) and to determine whether the CIN70 gene signature was enriched among the upregulated (positive NES) or downregulated (negative NES) genes. The parameters sample and gene normalization were set to rank and z-score, respectively, and a NES was calculated for each log2-transformed gene-expression profile (GSE16873, GSE21422, GSE47462, GSE52248; Mascaux and colleagues, ref. 34; Biswas and colleagues, ref. 65; Chen and colleagues, ref. 25). A Spearman correlation test between the CIN70 NES and APOBEC3 gene expression was used to calculate the reported correlation coefficients and associated P values.
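The Spearman coefficient used for the NES–expression comparison is the Pearson correlation of the ranked values. The sketch below, with hypothetical function names, computes the coefficient only (ties receive average ranks); it does not compute the associated P value and does not reproduce ssGSEA.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)
```

Here `x` would hold the CIN70 NES per sample and `y` the expression of one APOBEC3 gene in the same samples.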
Determining the PFI HR
Data generated by the TCGA pilot project established by the NCI and the National Human Genome Research Institute were downloaded for survival analysis in patients with LUAD and LUSC. The data were retrieved through the database of Genotypes and Phenotypes (dbGaP) authorization (accession no. phs000178.v9.p8). Information about TCGA and the investigators and institutions that constitute the TCGA research network can be found at https://cancergenome.nih.gov/.
To estimate the influence of APOBEC3 gene expression on survival, a Cox proportional-hazards model with PFI as the endpoint was used. PFI was defined as described elsewhere (72). The expression of the different APOBEC3 genes was measured as normalized expression counts. Before the genes were added to the model as continuous predictor variables, each variable was z-transformed so that the predictor estimates are comparable between genes. The Cox model was calculated separately for early-stage tumor samples (stages I and IA) and later-stage tumor samples (stages IIIA, IIIB, and IV).
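The z-transformation that makes the predictor estimates comparable can be sketched as follows. The use of the sample standard deviation and the function names are assumptions of this sketch; fitting the Cox model itself is outside its scope. Because each predictor is scaled to unit standard deviation, the exponentiated coefficient reads as the hazard ratio per one standard deviation increase in expression.

```python
import math
from statistics import mean, stdev

def z_transform(values):
    """Center values to mean 0 and scale to unit (sample) standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]

def hazard_ratio_per_sd(beta):
    """Hazard ratio for a one-SD increase, given a Cox coefficient `beta`
    estimated on a z-transformed predictor."""
    return math.exp(beta)
```

For example, a coefficient of 0 on a z-transformed gene corresponds to a hazard ratio of 1, that is, no association with PFI.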
Analysis of Genome Instability in APOBEC3 Heterogeneous Tumors
The multiregion WES data of the TRACERx100 cohort (1) were used to analyze differences in genome instability between tumor regions with a high and low number of APOBEC3 signature mutations within the same tumor. Only patients with a significant enrichment of the APOBEC3 mutation signature, as defined by Roberts and colleagues (4), were considered in this analysis. The tumor regions with the highest and lowest ratio of APOBEC3 to non-APOBEC3 signature mutations were identified for each tumor. Only patients with a significant difference in APOBEC3 mutation ratio between the high and low region were considered for further analysis (N = 14). The proportion of the genome altered by SCNAs was calculated as the sum of the sizes of segments with a copy-number gain or loss relative to the ploidy of the tumor region, divided by the sum of the sizes of all segments within the tumor region (1). A two-tailed paired Wilcoxon test was used to analyze the difference in the proportion of the genome altered between regions with high and low numbers of APOBEC3 signature mutations.
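The proportion of the genome altered can be written as a short calculation. In the sketch below, a segment is called altered when its integer copy number differs from the region ploidy rounded to the nearest integer; this calling rule, the segment tuple layout, and the function name are assumptions for illustration and may differ from the thresholds used in the original analysis (1).

```python
def proportion_genome_altered(segments, ploidy):
    """Fraction of the segmented genome with a gain or loss relative to ploidy.

    `segments` is an iterable of (start, end, copy_number) tuples for one
    tumor region; `ploidy` is that region's estimated ploidy.
    """
    baseline = round(ploidy)  # assumed reference copy number
    total_size = 0
    altered_size = 0
    for start, end, copy_number in segments:
        size = end - start
        total_size += size
        if copy_number != baseline:
            altered_size += size
    return altered_size / total_size
```

The resulting per-region proportions for the APOBEC3-high and APOBEC3-low regions of each tumor form the pairs compared in the paired test.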
Data Availability
The TRACERx WES data generated, used, or analyzed during this study are not publicly available, and restrictions apply to the availability of these data. Such TRACERx WES data are available through the Cancer Research UK and University College London Cancer Trials Centre (ctc.tracerx@ucl.ac.uk) for academic noncommercial research purposes upon reasonable request, and subject to review of a project proposal that will be evaluated by a TRACERx data access committee, entering into an appropriate data access agreement and any applicable ethical approvals.
Authors' Disclosures
M. Angelova reports a patent for WO2020201362 issued. K. Evangelou reports grants from European Union, grants from National Public Investment Program of the Ministry of Development and Investment/General Secretariat for Research and Technology, other support from Welfare Foundation for Social & Cultural Sciences, other support from H. Pappas, grants from Hellenic Foundation for Research and Innovation, and grants from NKUA-SARG during the conduct of the study. A. Pennycuick reports grants from Wellcome Trust during the conduct of the study; in addition, he has a patent for United Kingdom Patent Application No. 1819452.2 pending. S.J. Boulton reports personal fees from Artios Pharma Ltd during the conduct of the study. T.R. Fenton reports being on the Clinical and Scientific Advisory Board of APOBEC Discovery Ltd. E. Santoni-Rugiu reports personal fees from Pfizer, Takeda, Roche, and Bayer; also grants from Roche, and non-financial support from Takeda outside the submitted work. V.G. Gorgoulis reports grants from European Union, grants from National Public Investment Program of the Ministry of Development and Investment/General Secretariat for Research and Technology, other support from Welfare Foundation for Social & Cultural Sciences, other support from H. Pappas, grants from Hellenic Foundation for Research and Innovation, and grants from NKUA-SARG during the conduct of the study. M. Jamal-Hanjani reports grants from Cancer Research UK during the conduct of the study. N. McGranahan reports personal fees from Achilles Therapeutics outside the submitted work; in addition, he has a patent for PCT/GB2018/052004 pending, a patent for PCT/EP2016/071471 pending, a patent for PCT/GB2018/052004 pending, and a patent for PCT/GB2020/050221 pending. R.S. Harris reports other support from ApoGen Biotechnologies during the conduct of the study; other support from ApoGen Biotechnologies outside the submitted work. S.M. Janes reports grants from Wellcome and grants from CRUK during the conduct of the study; personal fees from AstraZeneca, personal fees from Johnson & Johnson, personal fees from Bard1 Lifesciences, and grants from GRAIL Inc outside the submitted work. S.F. Bakhoum reports personal fees and other support from Volastra Therapeutics outside the submitted work; in addition, he has a patent for Targeting cGAS-STING signaling in cancer pending. C. Swanton reports grants from Pfizer, Boehringer-Ingelheim, and Archer Dx Inc; grants and personal fees from Bristol Myers Squibb, AstraZeneca, Roche-Ventana, Ono Pharmaceuticals; personal fees from GlaxoSmithKline, Novartis, Celgene, Illumina, MSD, Amgen, Sarah Cannon Research Institute, Genentech, Bicycle Therapeutics, and Medicxi; personal fees and other support from GRAIL; other support from Epic Biosciences, Apogen Biotechnologies; and personal fees and other support from Achilles Therapeutics outside the submitted work; in addition, he has a patent for Immune checkpoint intervention in cancer (PCT/EP2016/071471) issued, a patent for Method for treating cancer based on identification of clonal neoantigens (PCT/EP2016/059401) issued, a patent for Methods for lung cancer detection (PCT/US2017/028013) issued, a patent for Method of detecting tumor recurrence (PCT/GB2017/053289) issued, a patent for Method for treating cancer (PCT/EP2016/059401) issued, a patent for Method of treating cancer by targeting insertion/deletion mutations (PCT/GB2018/051893) issued, a patent for Method of identifying insertion/deletion mutation targets (PCT/GB2018/051892) issued, a patent for Method for determining whether an HLA allele is lost in a tumor (PCT/GB2018/052004) issued, a patent for Method for identifying responders to cancer treatment (PCT/GB2018/051912) issued, and a patent for Method of predicting survival rates for cancer patients (PCT/GB2020/050221) issued; and he is Royal Society Napier Research Professor (RP150154).
His work was supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001169), the UK Medical Research Council (FC001169), and the Wellcome Trust (FC001169). C. Swanton is funded by Cancer Research UK (TRACERx, PEACE and CRUK Cancer Immunotherapy Catalyst Network), Cancer Research UK Lung Cancer Centre of Excellence, the Rosetrees Trust, Butterfield and Stoneygate Trusts, NovoNordisk Foundation (ID16584), Royal Society Research Professorship Enhancement Award (RP/EA/180007), the NIHR BRC at University College London Hospitals, the CRUK-UCL Centre, Experimental Cancer Medicine Centre and the Breast Cancer Research Foundation (BCRF). This research is supported by a Stand Up To Cancer-LUNGevity-American Lung Association Lung Cancer Interception Dream Team Translational Research Grant (SU2C-AACR-DT23-17). Stand Up To Cancer (SU2C) is a program of the Entertainment Industry Foundation. Research grants are administered by the American Association for Cancer Research, the Scientific Partner of SU2C. C. Swanton also receives funding from the European Research Council (ERC) under the European Union's Seventh Framework Programme (FP7/2007–2013) Consolidator Grant (FP7-THESEUS-617844), European Commission ITN (FP7-PloidyNet 607722), an ERC Advanced Grant (PROTEUS) from the European Research Council under the European Union's Horizon 2020 research and innovation programme (835297) and Chromavision from the European Union's Horizon 2020 research and innovation programme (665233). No disclosures were reported by the other authors.
Authors' Contributions
S. Venkatesan: Conceptualization, data curation, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. M. Angelova: Software, formal analysis, investigation, visualization, writing–original draft, writing–review and editing. C. Puttick: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. H. Zhai: Resources, data curation, formal analysis, validation, investigation, visualization, and methodology. D.R. Caswell: Investigation and methodology. W. Lu: Formal analysis, validation, investigation, visualization, and methodology. M. Dietzen: Formal analysis, investigation, visualization, and methodology. P. Galanos: Data curation, formal analysis, validation, investigation, and methodology. K. Evangelou: Investigation and methodology. R. Bellelli: Formal analysis, investigation, methodology, writing–review and editing. E.L. Lim: Resources, data curation, formal analysis, investigation, methodology, and project administration. T.B. Watkins: Investigation and methodology. A. Rowan: Resources, formal analysis, investigation, methodology, project administration, writing–review and editing. V.H. Teixeira: Resources, formal analysis, investigation, visualization, and methodology. Y. Zhao: Resources, data curation, formal analysis, investigation, and methodology. H. Chen: Resources, data curation, formal analysis, investigation, and methodology. B. Ngo: Investigation and methodology. L. Zalmas: Resources, data curation, investigation, methodology, and project administration. M. Al Bakir: Resources, data curation, software, formal analysis, investigation, methodology, and project administration. S. Hobor: Resources, formal analysis, investigation, and methodology. E. Grönroos: Investigation and methodology. A. Pennycuick: Resources, data curation, formal analysis, investigation, methodology, and project administration. E. Nigro: Resources, data curation, formal analysis, investigation, methodology, and project administration. B.B. Campbell: Investigation. W.L. Brown: Resources, investigation, methodology, writing–original draft, writing–review and editing. A.U. Akarca: Resources, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. T. Marafioti: Resources, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. M.Y. Wu: Resources, data curation, software, formal analysis, validation, investigation, visualization, methodology, and project administration. M. Howell: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. S.J. Boulton: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. C. Bertoli: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. T.R. Fenton: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. R.A. de Bruin: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. A. Maya-Mendoza: Investigation and methodology. E. Santoni-Rugiu: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. R.E. Hynds: Resources, formal analysis, validation, investigation, methodology, writing–original draft, writing–review and editing. V.G. Gorgoulis: Resources, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. M. Jamal-Hanjani: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. N. McGranahan: Resources, data curation, software, formal analysis, validation, investigation, visualization, methodology, project administration. R.S. Harris: Resources, data curation, formal analysis, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. S.M. Janes: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. J. Bartkova: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. S.F. Bakhoum: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. J. Bartek: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. N. Kanu: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. C. Swanton: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. TRACERx Consortium: Resources and investigation.
Acknowledgments
We thank the members of the TRACERx consortium for participating in this study. The results published here are in part based upon data generated by the TCGA pilot project established by the NCI and the National Human Genome Research Institute. R.E. Hynds is a Wellcome Trust Sir Henry Wellcome Fellow (WT209199/Z/17/Z) and received grant funding from the Roy Castle Lung Cancer Foundation that supported this work. P. Galanos is funded by KBVU grant R167-A11068. The work in the Boulton lab is supported by a European Research Council (ERC) Advanced Investigator Grant (TelMetab) and Wellcome Trust Senior Investigator and Collaborative Grants. T. Marafioti is supported by the UK National Institute of Health Research University College London Hospital Biomedical Research Centre and A.U. Akarca is supported by the Cancer Research UK–UCL Centre Cancer Immunotherapy Accelerator Award. R.A.M. de Bruin and C. Bertoli are supported by core funding to the MRC-UCL University Unit (Ref. MC_EX_G0800785) and funded by R.A.M. de Bruin's Cancer Research UK Programme Foundation Award. Cancer studies in the Harris lab are supported by NCI P01-CA234228. R.S. Harris is the Margaret Harvey Schering Land Grant Chair for Cancer Research, a Distinguished McKnight University Professor, and an Investigator of the Howard Hughes Medical Institute. K. Evangelou and V.G. Gorgoulis were financially supported by the European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement no. 722729 (SYNTRAIN); the National Public Investment Program of the Ministry of Development and Investment/General Secretariat for Research and Technology, in the framework of the Flagship Initiative to address SARS-CoV-2 (2020ΣE01300001); the Welfare Foundation for Social and Cultural Sciences (KIKPE), Greece; H. Pappas donation; grant no. 775 from the Hellenic Foundation for Research and Innovation (HFRI); and NKUA-SARG grants 70/3/9816, 70/3/12128, and 70/3/15603. S.F. 
Bakhoum is supported by the Office of the Director, the NIH under Award Number DP5OD026395 High-Risk High-Reward Program, the NCI Breast Cancer SPORE (P50CA247749) and R01 (R01CA256188-01), the Burroughs Wellcome Fund Career Award for Medical Scientists, the Parker Institute for Immunotherapy at MSKCC, the Josie Robertson Foundation, and the MSKCC core grant P30-CA008748. J. Bartek and his team were funded by grants from the Danish Cancer Society (R1123-A7785-15-S2 and R167-A11068), the Novo Nordisk Foundation (16854 and 0060590), the Lundbeck Foundation (R266-2017-4289 and R322-2019-2577), the Swedish Research Council (VR-MH 2014-46602-117891-30), the Swedish Cancer Foundation/Cancerfonden (170176), and the Danish National Research Foundation (project CARD, DNRF 125). N. Kanu receives funding from Cancer Research UK. C. Swanton is a Royal Society Napier Research Professor (RP150154). This work was supported by the Francis Crick Institute that receives its core funding from Cancer Research UK (FC001169), the UK Medical Research Council (FC001169), and the Wellcome Trust (FC001169). This research was funded in whole, or in part, by the Wellcome Trust (FC001169). For the purpose of open access, the authors have applied a CC BY copyright license to any author-accepted manuscript version arising from this submission. This work was supported by the Breast Cancer Research Foundation (BCRF), the European Research Council (ERC) under the European Union's Seventh Framework Programme (FP7/2007-2013) Consolidator Grant (FP7-THESEUS-617844), an ERC Advanced Grant (PROTEUS) from the European Research Council under the European Union's Horizon 2020 research and innovation programme (835297), Novo Nordisk Foundation (ID16584), and the National Institute for Health Research (NIHR) Biomedical Research Centre at University College London Hospitals. The authors thank Dr. Silvestro Conticello and Dr. Uday Munagala (ISPRO, Italy) for critical discussions. The authors thank Dr. 
Stephen Elledge (Harvard University) for providing miR-146a-EGFP plasmid, and Dr. Julian Downward and Dr. Miriam Molina (The Francis Crick Institute, UK) for providing TIIP-ER-KRAS V12 cells. We also thank Dr. Sarah Clarke (University College London, UK) for their assistance with normal lung and CIS FFPE blocks. We also thank Laura Tovini and Dr. Sarah McClelland (Barts Cancer Institute, London) for their helpful advice on the ImageStream. We also thank Dr. Christoffer L. Halvorsen (Danish Cancer Society Research Center, Denmark), Dr. Robert Strauss (Danish Cancer Society Research Center, Denmark), Dr. Agostina Bertolin (The Francis Crick Institute, UK), and Dr. David Moore (The Francis Crick Institute, UK) for helpful suggestions. Finally, the authors gratefully acknowledge members of Experimental Histopathology, Light Microscopy, Flow Cytometry, Cell Services, and Genomics Equipment Park at the Francis Crick Institute (The Francis Crick Institute, UK).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.