Mutagenic processes leave distinct signatures in cancer genomes. The mutational signatures attributed to APOBEC3 cytidine deaminases are pervasive in human cancers. However, data linking individual APOBEC3 proteins to cancer mutagenesis in vivo are limited. Here, we showed that transgenic expression of human APOBEC3G promotes mutagenesis, genomic instability, and kataegis, leading to shorter survival in a murine bladder cancer model. Acting as mutagenic fuel, APOBEC3G increased the clonal diversity of bladder cancer, driving divergent cancer evolution. Characterization of the single-base substitution signature induced by APOBEC3G in vivo established the induction of a mutational signature distinct from those caused by APOBEC3A and APOBEC3B. Analysis of thousands of human cancers revealed the contribution of APOBEC3G to the mutational profiles of multiple cancer types, including bladder cancer. Overall, this study dissects the mutagenic impact of APOBEC3G on the bladder cancer genome, identifying that it contributes to genomic instability, tumor mutational burden, copy-number loss events, and clonal diversity.

Significance:

APOBEC3G plays a role in cancer mutagenesis and clonal heterogeneity, which can potentially inform future therapeutic efforts that restrict tumor evolution.

See related commentary by Caswell and Swanton, p. 487

Cancer evolution, heterogeneity, and treatment resistance are often linked to the acquisition of somatic mutations and chromosomal instability through various mutagenic processes that leave distinct signatures within cancer genomes (1). Somatic mutational signatures enriched with cytidine to thymine or guanine substitutions in the TCW motifs (W: A or T) are prevalent in the genomes of several human cancers, particularly urothelial bladder cancer (1–3). These signatures have been associated with the apolipoprotein B mRNA-editing enzyme catalytic subunit 3 (APOBEC3) family (4, 5), which encodes seven APOBEC3 proteins (APOBEC3A, B, C, D, F, G, H) in humans (2). APOBEC3 proteins have overlapping biochemical activity (6), confounding cause and effect experiments aiming to dissect the mutagenic impact of individual APOBEC3 proteins in human cells. However, there are only limited experimental data linking individual APOBEC3 proteins to known mutational signatures in animal cancer models (7). Consequently, the role of individual APOBEC3 proteins in driving mutagenesis, cancer cell fitness, and clonal evolution in vivo remains unclear. These knowledge gaps have impeded efforts to therapeutically target mutagenic processes to alter the fitness of cancer cells (8–10).

APOBEC3G is a double deaminase domain protein that is ubiquitously expressed in normal and cancer cells (11). Its canonical function is to restrict retroviruses (12, 13), but its role in mutating tumor genomic DNA in vivo has been unknown prior to our study. Here, we show that transgenic human APOBEC3G significantly promotes genomic instability by increasing tumor mutational burden and copy-number alterations in a bladder cancer mouse model. These APOBEC3G-induced genomic lesions act as “mutagenic fuel” to drive clonal divergence and intratumor heterogeneity. We characterize a single-base substitution signature induced by APOBEC3G in vivo and describe its contribution to the mutational profiles of thousands of human cancers.

Animals

The mouse Apobec3 null mice (mA3−/−) and mice expressing the human APOBEC3G transgene on a null mouse Apobec3 (hA3G+mA3−/−) background were previously described (14, 15). A breeding scheme was used in which one parent carried the transgene (hA3G+mA3−/−), and the other parent did not (mA3−/−). This allowed us to use the mA3−/− littermates as the control group in our study. Mice were weaned at 3 weeks of age and group-housed. Genotyping of each animal was performed by Transnetyx using real-time PCR to detect the human transgenes and confirm the absence of mouse Apobec3 using primers as previously described (14, 15). The C57BL/6 male mice used as controls were purchased from The Jackson Laboratory. The animal experiments were carried out following the Weill Cornell Medical College Institutional Animal Care and Use Committee guideline (IACUC Protocol 2017-0048), which was approved by Weill Cornell Medicine.

Cell lines

The HEK293T and 5637 cell lines were purchased from ATCC (CRL-3216, HTB-9). The RT112/84 cell line was purchased from Millipore Sigma (#85061106). The HBLAK cell line was purchased from CELLnTEC advanced cell systems. The HEK293T cells were cultured in a humidified incubator at a constant 37°C and 5% CO2 in DMEM (Gibco) with 100 U/mL penicillin–streptomycin (Gibco) and 10% FBS (Omega Scientific). The 5637 and RT112/84 cells were cultured in a humidified incubator at a constant 37°C and 5% CO2 in RPMI-1640 (Gibco) and EMEM (ATCC) medium, respectively, with 100 U/mL penicillin–streptomycin (Gibco) and 10% tetracycline-free FBS (Omega Scientific). The HBLAK cells were cultured in a humidified incubator at a constant 37°C and 5% CO2 in Keratinocyte SFM medium (Gibco) with 100 U/mL penicillin–streptomycin (Gibco). The cells were tested for Mycoplasma contamination with the PCR detection kit (ABM) according to the manufacturer's instructions.

Study design

N-butyl-N-(4-hydroxybutyl)nitrosamine (BBN) was purchased as a single batch from TCI America (Batch ODW3F-FH). Red tinted water bottles (ANCARE) were used during drug administration since BBN is a light-sensitive compound. Ad libitum BBN was administered at a concentration of 0.05% in water and replenished twice weekly. Bottles were weighed to determine the volume drunk per cage during each period. The volume consumed by each animal was determined as the quotient of total water drunk per cage divided by the number of animals in that cage. Mice began BBN administration via drinking water at 8 to 10 weeks of age and continued for 12 weeks. After 12 weeks of ad libitum carcinogen administration, mice were returned to unaltered drinking water for 18 weeks to provide adequate time for APOBEC3G-induced mutagenesis. All experimenters were blinded to the mA3−/− and the hA3G+mA3−/− genotypes due to the mixed genotype populations in each cage. The C57BL/6J mice were treated in separate cages. Survival curves were calculated as the percentage of mice that expired prior to the 30-week time point.

RNA extraction and qPCR analysis

The bladder, lung, and spleen tissues were harvested from three hA3G+mA3−/− and three mA3−/− mice, respectively. Tissues were snap-frozen using liquid nitrogen. Tissues were homogenized using BioMasher II (TaKaRa) with RLT Plus buffer (1% β-mercaptoethanol). Total RNA was extracted using the RNeasy Plus Mini Kit protocol (Qiagen). The RNA concentration was determined by NanoDrop (Thermo Fisher Scientific). cDNA was synthesized using the SuperScript III First-Strand Synthesis System (Invitrogen). Real-time PCR was performed by LightCycler 480 (Roche) using Power SYBR Green Master Mix (Applied Biosystems). All reactions were run in triplicate and analyzed using the LightCycler 480 software. The APOBEC3G expression was normalized to GAPDH. The qPCR primers are listed in Supplementary Table S1.

SDS-PAGE Western blotting

Whole bladder protein lysate was obtained by homogenizing bladder tissue using Biomasher II (TaKaRa) in ice-cold RIPA buffer with the protease inhibitor (Thermo Fisher Scientific). Protein concentration was determined using a Pierce BCA Assay (Thermo Fisher Scientific). The lysate was run on a 10% SDS-PAGE gel with MOPS buffer and transferred using the iBlot system (Thermo Fisher Scientific). The anti-myc antibody (Abcam, ab9106, 1:1,000) and anti-α-tubulin (Millipore, 05-829, 1:1,000) were separately used as primary antibodies at 4°C overnight. HRP-conjugated secondary antibodies (Goat anti-Rabbit, 32260, and Goat anti-Mouse, 32230, Invitrogen, 1:1,000) were incubated for 1 hour at room temperature. Blots were developed using Luminata Forte Poly HRP Substrate (Millipore) reagent and imaged on the ChemiDoc imager (Bio-Rad). The protein lysate from the wild-type HEK293T cell and HEK293T transfected with the hAPOBEC3G-myc-T2A-eGFP vector using Lipofectamine2000 (Thermo Fisher Scientific) were used as negative and positive controls, respectively. This vector was synthesized by VectorBuilder, and its sequence was validated by Sanger sequencing.

Generation of doxycycline-inducible cell lines

The doxycycline-inducible GFP-tagged APOBEC3G vector and PiggyBac transposase vector were obtained as a gift from Dr. John Maciejowski Lab. The sequences were validated by Sanger sequencing. The human urothelial bladder tumor cells were transfected with the doxycycline-inducible GFP-tagged APOBEC3G vector and the PiggyBac transposase vectors with the Lipofectamine2000 transfection agent (Thermo Fisher Scientific). G418 was used for the selection of the clones with stable genomic insertion of the GFP-tagged APOBEC3G component. The serial dilution method was used to isolate single-cell clones in 96-well plates. The expression of APOBEC3G in each single-cell clone was validated by SDS-PAGE Western blot.

Nuclear and cytoplasmic extraction

The NE-PER nuclear and cytoplasmic extraction kit (Thermo Fisher Scientific) was used according to the manufacturer's instructions to isolate nuclear and cytoplasmic protein fractions from the cells after 2 days of doxycycline induction. The anti-GFP antibody (Abcam, ab13970, 1:2,000), anti-α-tubulin (Millipore, 05-829, 1:1,000), and anti-Lamin A/C (Cell Signaling Technology, 4777S, 1:1,000) were used as primary antibodies for 2 hours at room temperature. The anti-myc antibody (Abcam, ab9106, 1:1,000) and anti-APOBEC3G antibody (ARP-10082, 1:1,000) were used as primary antibodies at 4°C overnight. HRP-conjugated secondary antibodies (Goat anti-Rabbit, 32260, Rabbit anti-Chicken, 31401, and Goat anti-Mouse, 32230, Invitrogen, 1:1,000) were used for 1 hour at room temperature. The anti-APOBEC3G antibody (ARP-10082) was obtained through the NIH HIV Reagent Program. It is a polyclonal antiserum generated by Dr. Klaus Strebel against a synthetic peptide comprising the 17 C-terminal amino acids of human APOBEC3G (16, 17). This antibody was previously verified by expressing or knocking down APOBEC3G in A3.01 cells (17). Blots were developed using Luminata Forte Poly HRP Substrate (Millipore) reagent and imaged on the ChemiDoc imager (Bio-Rad).

Immunofluorescence microscopy

The cells were plated in a T25 flask and treated with vehicle (DMSO) or doxycycline for 48 hours. Cells were harvested, and nuclei were isolated using hypotonic buffer as previously described (18). The isolated nuclei were spun down at 400 g for 5 minutes onto coverslips pretreated with poly-L-ornithine (Sigma), fixed in 4% paraformaldehyde at room temperature for 10 minutes, and washed 3 times with PBS. After the slides were made, the immunofluorescence (IF) protocol was performed. The fixed nuclei were treated with 0.5% Triton X-100 in PBS and then blocked with 2.5% normal goat serum in PBS. The anti-α-tubulin antibody (Millipore, 05-829, 1:1,000) was used as the primary antibody at 4°C overnight. Then, the AlexaFluor-594 goat anti-mouse antibody (Invitrogen, a11032, 1:1,000) was used as a secondary antibody for 2 hours at room temperature. After that, the slide was sealed with ProLong Gold Antifade Mountant with DAPI (Invitrogen, p36931). High-resolution imaging was performed using a DeltaVision Elite system with a 60×/1.42 oil objective (Olympus) and an EDGE sCMOS 5.5 camera. Twelve different fields of view for each group were randomly picked for imaging. Different channels were acquired for DAPI, FITC, and TRITC filters. Z-stacks of 0.2-μm increments for a total of 20 stacks (4 μm) were acquired per field of view. The captured images of the nuclei were adjusted and quantified using ImageJ Fiji software as follows. The Z-stacks were summed to create a sum projection, and background subtraction was performed. Then the image was split into three channels: DAPI (blue), GFP (green), and α-Tubulin (red). The DAPI image was processed by Gaussian blur, autothresholded using the Otsu method, and then converted into a mask to create a list of regions of interest based on size and circularity. Manual correction was performed to ensure that only single nuclei were quantified. Next, the mean intensity value of the GFP (green) channel was measured for the regions of interest. Multiple comparisons were performed for statistical analysis.

Confocal laser scanning fluorescent microscopy and 3D image reconstruction

A fluorescent image of a single isolated nucleus was captured using a Zeiss LSM880 inverted confocal microscope with a 63×/1.4 oil DIC M27 objective. The DAPI and FITC channels were scanned using an auto-optimized setup by the microscopy software ZEN Black 2.1 SP3 LSM. The image data were imported into the Imaris software (Oxford Instruments) for 3D reconstruction. The nuclear surface was reconstructed based on the DAPI staining. Then the GFP signal within the nuclear surface was isolated and reconstructed using the default settings. The green color represents the GFP signal, and the violet represents the reconstructed nuclear surface.

Deamination assay

The purified APOBEC3G protein was obtained from the NIH AIDS program (ARP-10068). The oligos with targeted trinucleotide motifs and a 5′ AlexaFluor-488 modification were generated by IDT. The sequences of the oligos are listed in Supplementary Table S1. Oligo (0.5 μmol/L) was incubated at 37°C for 24 hours with 10 μmol/L purified APOBEC3G protein (or without protein as a negative control), 500 μg/mL RNase A, and 2 units UDG (NEB) in a volume of 20 μL HED buffer (25 mmol/L HEPES, 5 mmol/L EDTA, 10% glycerol, 1 mmol/L DTT, and 1× protease inhibitor; Thermo Fisher). NaOH was then added to a final concentration of 100 mmol/L. Samples were incubated at 37°C for 30 minutes. After adding 2× loading buffer (95% formamide with 20 mmol/L EDTA), samples were incubated at 95°C for 3 minutes. Products were separated by 15% TBE-Urea PAGE (Thermo Fisher Scientific) and imaged on the Gel Doc EZ imager (Bio-Rad). The positive controls were control oligos with U in the middle of each trinucleotide motif.

Excision and processing of bladder tumors

Mice were euthanized at the prespecified 30-week timepoint. The bladders were excised, weighed, and measured using a pathology ruler. Bladders were bisected along the sagittal plane, placed urothelium-side down in a cryomold, embedded in OCT (Tissue-Tek) using 2-methylbutane with dry ice, and sectioned with a cryostat. Cryomolds were also created from the bladders of expired mice. Cryomolds were stored at −80°C. Frozen slides were generated and converted to FFPE slides for hematoxylin and eosin (H&E) staining at the Department of Pathology at Weill Cornell Medicine for histopathologic review. The H&E slides underwent histopathologic review by a fellowship-trained genitourinary pathologist (FK) to annotate cancer-involved regions and determine the extent of the tumors. The pathologist was blinded to the genotype and experimental condition of the samples during the review.

Generation of murine tumor organoids

After excision of the bladder, a piece of tissue was obtained from the bladder tumor and placed in Advanced DMEM (Gibco) containing collagenase IV (100 U/mL; Gibco) and 10 μmol/L ROCK inhibitor Y-27632 (Selleckchem, S6390) at 37°C for 6 hours. After the cell solution was filtered by the cell strainer (FALCON), the tumor cells were harvested, resuspended in an organoid media with Matrigel (Corning), and plated as 3D droplets in a 6-well suspension plate (Cellstar). The organoid media composition was previously described (19). After sufficient growth in the 3D culture, the cells were passaged into the 2D culture in the same medium.

DNA extraction, library preparation, and whole-exome sequencing

Frozen tumor tissue was punched on dry ice using a 1-mm biopsy punch (Miltex) based on the region marked by the pathologist. The DNA from mouse bladder tumors and matched germline DNA from mouse tails were extracted using the DNeasy Blood & Tissue Kit (Qiagen) according to the manufacturer's instructions. DNA was stored at −80°C until library preparation. Quality control for DNA samples was performed using TapeStation (Agilent), and libraries for whole-exome sequencing were prepared with the Agilent SureSelect kit (SureSelect Mouse All Exon Kit) at the Weill Cornell Medical College genomics core facility. Pooled samples were loaded onto a NovaSeq6000 (Illumina) and sequenced (paired-end 2 × 100). The sequencing coverages of bladder tumors and matched germline tissues were generated using Picard before removing PCR-duplicate reads.

Alignment and somatic variant calling

The mouse reference genome GRCm38/mm10 was used for short-read alignment with our in-house alignment pipeline. Short reads were trimmed for adapter sequences and aligned by BWA MEM, indel realignment was performed via GATK, and PCR duplicates were removed by Picard Tools. Somatic single-nucleotide variants (SNV) were called using The Cancer Genome Atlas (TCGA) MC3 Variant Calling Strategy, which merges calls from seven established variant callers (Strelka2, MuSE, MuTect2, Pindel, RADIA, SomaticSniper, and VarScan), with MuTect2 and Strelka2 replacing MuTect and Strelka (20). Default parameters were used for these seven tools, except for instances identified by the original MC3, where non-default settings yielded optimal performance. MuSE was implemented with the germline resource. Pindel was implemented with a blacklist reference made available through the ENCODE Blacklist resource. Post hoc filtering of the variants identified by each caller was done identically to the MC3 pipeline except as follows. For MuSE, retained variant calls had at least Tier 4-level significance, which corresponds to calls with an added false-positive and false-negative probability of less than 1%. For Strelka2, the variant calls were filtered to those that passed the tool's internal significance testing. For MuTect2, the variant calls were filtered in two steps. The first step entailed filtering based on the significance scoring of each call and the estimation of sample contamination, implemented via the CalculateContamination and FilterMutectCalls methods in GATK4. The second step consisted of quantifying nucleotide substitution errors caused by mismatched base pairings during various sample/library preparation stages. Such errors include artifacts introduced before the addition of adapters and those introduced after target selection. These sequencing artifacts were then filtered from the original MuTect2 calls using the CollectSequencingArtifactMetrics and FilterByOrientationBias methods of GATK4. All filtered variant calls from each tool were sorted using vcf-sort from the vcftools package and standardized as in the MC3 pipeline. After standardization, the calls from multiple tools were merged into a single file. The merged variant calls were further filtered for the sequence artifacts described above and for calls within the capture region designed by SureSelect. The following post-filters were then applied to the merged calls: (i) variants had to be called by at least two callers; (ii) only SNVs were included for further analysis; (iii) variants were filtered by mouse dbSNP (build 146 for the mouse mm10 assembly); (iv) total read counts were ≥20 in germline and ≥40 in tumor; (v) alt read counts in germline were <5 and the ratio of tumor variant allele frequency (VAF) to germline VAF was ≥5; and (vi) alt read counts in tumor were >10. Total mutations per Mb were calculated as the total number of somatic variants, including missense and nonsense variants in the coding regions after annotation, divided by 40.6 Mb, the size of the protein-coding regions.
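The post-filters (i) to (vi) and the mutational burden calculation can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in the study: the `Call` record and its field names are hypothetical, filter (iii) is assumed to mean that variants present in dbSNP are removed, and a germline VAF of zero is assumed to satisfy the VAF ratio criterion.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical summary of one merged variant call."""
    n_callers: int       # number of callers reporting the variant
    is_snv: bool         # True for single-nucleotide variants
    in_dbsnp: bool       # True if present in mouse dbSNP (assumed to be removed)
    germline_depth: int  # total reads in germline
    germline_alt: int    # alt reads in germline
    tumor_depth: int     # total reads in tumor
    tumor_alt: int       # alt reads in tumor


def passes_post_filters(c: Call) -> bool:
    # (i) at least two callers, (ii) SNVs only, (iii) not in dbSNP
    if c.n_callers < 2 or not c.is_snv or c.in_dbsnp:
        return False
    # (iv) total read counts: germline >= 20 and tumor >= 40
    if c.germline_depth < 20 or c.tumor_depth < 40:
        return False
    # (v, first part) germline alt reads < 5; (vi) tumor alt reads > 10
    if c.germline_alt >= 5 or c.tumor_alt <= 10:
        return False
    # (v, second part) tumor VAF / germline VAF >= 5
    tumor_vaf = c.tumor_alt / c.tumor_depth
    germline_vaf = c.germline_alt / c.germline_depth
    # Assumption: a germline VAF of 0 trivially satisfies the ratio criterion.
    return germline_vaf == 0 or tumor_vaf / germline_vaf >= 5


def mutations_per_mb(n_coding_variants: int, coding_mb: float = 40.6) -> float:
    """Somatic coding variants divided by the protein-coding region size (Mb)."""
    return n_coding_variants / coding_mb
```

For example, a call supported by two callers with 12 of 50 alt reads in the tumor and no alt reads among 30 germline reads passes, whereas the same call with 4 of 20 germline alt reads fails the VAF ratio criterion.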

Annotation of somatic calls

Somatic calls from the pipeline above were annotated using VEP version 95 and converted to MAF (Mutation Annotation Format) format using the vcf2maf.pl script (https://github.com/mskcc/vcf2maf).

Copy-number variant analysis

The copy-number variants (CNV) were evaluated using CNVkit (21). The CNV results were visualized as the heat map of binned log2 ratios using the ComplexHeatmap package (22) and circos plots using the RCircos package (23). The cytoband information and known gene definitions of mm10 assembly were extracted using the UCSC.Mouse.GRCm38.CytoBandIdeogram, TxDb.Musculus.UCSC.mm10.knownGene, and org.Mm.eg.db packages. The log2 value < −0.2 was considered copy-number losses, and the log2 value > 0.2 was considered copy-number gains. The length of segments involved in CNV was then calculated.
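The thresholding step can be illustrated with a short Python sketch. It assumes segments are available as (start, end, log2 ratio) tuples, which is a hypothetical representation chosen for illustration rather than the CNVkit output format, and it treats values exactly at ±0.2 as neutral, following the strict inequalities stated above.

```python
def classify_segment(log2_ratio, loss_cutoff=-0.2, gain_cutoff=0.2):
    """Label a segment as loss, gain, or neutral from its log2 ratio."""
    if log2_ratio < loss_cutoff:
        return "loss"
    if log2_ratio > gain_cutoff:
        return "gain"
    return "neutral"


def altered_length(segments):
    """Sum segment lengths (end - start) per copy-number class.

    segments: iterable of (start, end, log2_ratio) tuples (hypothetical layout).
    """
    totals = {"loss": 0, "gain": 0, "neutral": 0}
    for start, end, log2_ratio in segments:
        totals[classify_segment(log2_ratio)] += end - start
    return totals
```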

Clonal diversity analysis

Clone numbers and distributions were analyzed using the Pyclone-vi computational tool (24) and the EXPANDS tool (25, 26) using SNV and CNV data as inputs. The data inputs for EXPANDS were generated by CNVkit. The data inputs for Pyclone-vi were generated by the TITANCNA package (27). True diversities (Hill numbers, H) were calculated as ${}^{q}D = \left( \sum_{i=1}^{R} p_i^{q} \right)^{1/(1-q)}$. The Shannon entropy (H′) was calculated as $\ln({}^{1}D)$, where ${}^{1}D$ is the Hill number of order 1. Diversity profiles for the hA3G+mA3−/− and the mA3−/− tumors were generated by graphing H(p,q) against q, where q is the sensitivity parameter of the index. Differences between the medians of Hill numbers of the hA3G+mA3−/− and the mA3−/− tumors were separately analyzed using bootstrap-coupled estimation statistics and visualized as Gardner–Altman plots (28). The bias-corrected and accelerated 90% CI for the differences in medians was calculated from 5,000 bootstrapped samples. The P value reported is the likelihood of observing the effect size if the null hypothesis of zero difference is true. Phylogenetic trees were generated by EXPANDS and visualized using the ggtree package (29).
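As a worked illustration of the diversity indices, the following Python sketch computes a Hill number of order q and the corresponding Shannon entropy from a list of clone proportions. The function names are chosen for illustration, the proportions are assumed to sum to 1, and the order-1 case is handled through its standard limit, the exponential of the Shannon entropy.

```python
import math


def hill_number(proportions, q):
    """Hill number of order q for clone proportions that sum to 1."""
    p = [x for x in proportions if x > 0]  # empty clones do not contribute
    if abs(q - 1.0) < 1e-12:
        # The limit q -> 1 is the exponential of the Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


def shannon_entropy(proportions):
    """Shannon entropy, computed as the natural log of the order-1 Hill number."""
    return math.log(hill_number(proportions, 1.0))


def diversity_profile(proportions, orders):
    """Hill numbers across a range of q values, for plotting against q."""
    return [(q, hill_number(proportions, q)) for q in orders]
```

For a tumor with four equally abundant clones, the Hill number is 4 at every order, whereas a tumor dominated by one clone shows a Hill number that falls as q increases.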

Kataegis analysis

Clustered mutational events were identified using two approaches. First, we defined kataegic clusters by an intermutational distance (<1 Kb) and mutation number (≥4) within a cluster. The distance and the genomic location of somatic mutations were generated using MutationalPatterns (30). The second method, SeqKat, identifies the clustered mutational events based on the binomial test (31). APOBEC-related kataegic events were defined as mutational clusters harboring enrichment of C>T mutations in cis based on the binomial test (at least 2 C>T mutations in cis in 4 to 5 clustered mutations or at least 3 C>T mutations in cis in 6 to 10 clustered mutations). Fisher exact test was used to compare the number of APOBEC-related kataegis within the total kataegic clusters in the hA3G+mA3−/− and the mA3−/− tumors and to compare the enrichment of C>T substitutions in cis compared with other C>T substitutions (singlet or clustering in trans) within kataegic loci between the hA3G+mA3−/− and the mA3−/− tumors.
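The first, distance-based definition can be sketched as follows in Python. This is a simplified illustration and not the MutationalPatterns or SeqKat implementation: it assumes that the intermutational distance is measured between consecutive mutations on one chromosome, and that cluster sizes outside the stated range of 4 to 10 mutations are not classified as APOBEC-related.

```python
def find_clusters(positions, max_gap=1000, min_mutations=4):
    """Group mutations on one chromosome into candidate kataegic clusters.

    Consecutive mutations closer than max_gap bp are chained together;
    chains with at least min_mutations mutations are returned.
    """
    clusters, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] >= max_gap:
            if len(current) >= min_mutations:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_mutations:
        clusters.append(current)
    return clusters


def is_apobec_related(n_mutations, n_ct_in_cis):
    """Apply the stated C>T-in-cis thresholds to one cluster."""
    if 4 <= n_mutations <= 5:
        return n_ct_in_cis >= 2
    if 6 <= n_mutations <= 10:
        return n_ct_in_cis >= 3
    # Assumption: sizes outside 4 to 10 are not covered by the stated rule.
    return False
```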

Transcriptional and replicative strand asymmetry of APOBEC3-induced mutations

The transcriptional strand annotation for each mutation was generated using the mut_strand function in MutationalPatterns (30). The ratio of the total count of C>T mutations in the transcribed strand/the total count of C>T mutations in the un-transcribed strand was calculated for each tumor. The ratios of the hA3G+mA3−/− tumors were standardized to the mean ratio of the mA3−/− tumors. For the replicative strand bias analysis, the reference strand annotations were obtained from Riva and colleagues (32). Then, the replicative strand information for each mutation was generated using the mut_matrix_stranded replication model in MutationalPatterns (30). The replicative strand ratios of C>T mutations were then calculated for the hA3G+mA3−/− tumors, then standardized to the mean ratio of the mA3−/− tumors.
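The ratio and standardization steps amount to the following arithmetic, shown here as a minimal Python sketch. The function names and the input layout (per-tumor C>T counts on each strand) are assumptions made for illustration.

```python
def strand_ratio(count_strand_a, count_strand_b):
    """Ratio of C>T counts on one strand to C>T counts on the other strand."""
    return count_strand_a / count_strand_b


def standardize_to_control(case_ratios, control_ratios):
    """Divide each case ratio by the mean ratio of the control tumors."""
    control_mean = sum(control_ratios) / len(control_ratios)
    return [r / control_mean for r in case_ratios]
```

A standardized value above 1 indicates a stronger strand bias in a hA3G+mA3−/− tumor than the average bias in the mA3−/− tumors.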

Principal component analysis

The trinucleotide mutational context for each sample was generated by MutationalPatterns (30). Based on the trinucleotide context for each sample, the principal component analysis (PCA) analysis was performed by the factoextra package. The confidence ellipse for each group represents the 95% confidence interval. The permutational multivariate analysis of variance test was used to calculate the P value via the vegan package.

Mutational signature analysis

We used a bootstrap resampling method (8, 9) to extract the net APOBEC3G-induced mutational signature. Apart from the transgenic expression of human APOBEC3G, the backgrounds of the two strains of mice are genetically identical. We hypothesized that, compared with the mA3−/− tumors, the hA3G+mA3−/− tumors harbor human APOBEC3G-induced mutational signatures that increase the total mutational burden. To detect the mutational shift distance (MSD) between the hA3G+mA3−/− and the mA3−/− tumors, we used bootstrap resampling as previously described (8). The normalized distance was calculated between the centroids of resamples and the original samples. The centroids were determined based on the mean value of the percentage of substitutions in the 96-channel matrices. The distribution of the normalized distance was obtained by repeating the previous step 10,000 times. The threshold distance for each group corresponding to a P value (P = 0.05) was calculated. A significant mutational spectrum shift between the hA3G+mA3−/− and the mA3−/− tumors was determined to be present if it crossed the threshold of the distances of the two resampling groups (9). To detect significantly increased substitutions (P < 0.1) and extract the distinct APOBEC3G signature, the centroid of the bootstrapped hA3G+mA3−/− tumors was compared with the centroid of the mA3−/− tumors. For significantly increased substitution types, the magnitude of the increased counts was calculated as the difference between the centroids multiplied by the mean counts of resampled tumors. This step was then repeated 10,000 times. The significantly increased counts in each trinucleotide context were averaged to construct the APOBEC3G signature de novo as previously described (8). To validate the signature we extracted, we used two additional methods for signature extraction: the SigneR (33) tool, which provides a full Bayesian treatment of the nonnegative matrix factorization (NMF) model, and the HDP package (32), which uses the hierarchical Dirichlet process to extract mutational signatures de novo. For each approach, the trinucleotide mutational matrices of the mA3−/− and the hA3G+mA3−/− tumors were imported separately to extract the signatures de novo. After extraction, the cosine similarity between signatures from the mA3−/− and the hA3G+mA3−/− tumors was calculated by MutationalPatterns (30). Then, the unique signature from the hA3G+mA3−/− tumors, which had low cosine similarity to the signatures extracted from the mA3−/− tumors, was considered the signature induced by APOBEC3G (SBS.A3G).

Mutational signature induced by transgenic APOBEC3A expression

The raw sequencing data of 5 tumors from the transgenic APCmin C57BL/6J mice expressing human APOBEC3A and 4 tumors from the APCmin C57BL/6J mice were downloaded from the SRA database (BioProject ID: PRJNA655491; ref. 7). We reanalyzed the data and performed mutational calling using our MC3 pipeline with the post-filters as below: (i) At least two callers should call the variants. (ii) Only the SNVs were included for further analysis. (iii) The total tumor read counts ≥20. (iv) The normal VAF ≤ 0.01. (v) The tumor VAF > 0.02. The trinucleotide mutational context for each sample was generated by MutationalPatterns (30). The mutational signature SBS.A3A was extracted using a statistical framework based on the MSD method (8).

Cosine similarity to COSMIC signatures

The cosine similarity between the experimentally extracted signatures and Catalogue of Somatic Mutations in Cancer (COSMIC) signatures (sigProfiler_exome_SBS_signatures) was computed using MutationalPatterns (30). The signatures extracted from the whole-exome sequencing of murine tumors were normalized for the trinucleotide frequency in the human exome. The trinucleotide count in the mouse SureSelect whole-exome sequencing region was calculated using the triCount.R function in the mutationProfiles package. The trinucleotide count of the human GRCh37/hg19 whole exome was obtained from the deconstructSigs package. The adjusted SBS.A3G was then used to calculate the cosine similarity to COSMIC exome signatures V3.
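The two operations in this section, adjusting a signature for trinucleotide frequency and comparing two signatures by cosine similarity, can be sketched in plain Python. This is an illustrative sketch rather than the MutationalPatterns implementation. The adjustment shown, scaling each channel by the ratio of target to source trinucleotide counts and rescaling to sum to 1, is one common approach and is an assumption here, as are the function names and the per-channel alignment of the count vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two signature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_a * norm_b)


def renormalize(signature, source_counts, target_counts):
    """Re-express a signature for a different trinucleotide background.

    source_counts[i] and target_counts[i] are the counts of the reference
    trinucleotide underlying channel i in the source (e.g., mouse exome)
    and target (e.g., human exome) regions.
    """
    weighted = [
        s * t / c for s, c, t in zip(signature, source_counts, target_counts)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]
```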

Fitting extracted signatures to human cancer

Three different pipelines were used to fit the experimental signatures to human cancer: (i) MutationalPatterns (30), which uses a backward method to find a nonnegative linear combination of mutational signatures that reconstructs the mutation matrix (fit_to_signatures_strict function); (ii) deconstructSigs (34), which uses a multiple linear regression model to reconstruct the mutational profile of a single tumor sample; and (iii) sigLASSO (35), which jointly optimizes the likelihood of sampling and signature fitting. Given the overlap between the UV signatures (SBS7a and SBS7b) and the experimentally validated SBS.A3G, we replaced SBS7a and SBS7b with SBS.A3G and fitted it together with the 63 other COSMIC V3 exome signatures to patients’ tumors. The consistency of the mutational contribution from the respective signature among the three fitting pipelines was measured by Pearson correlation analysis using the corrplot R package.

Analysis of APOBEC3G-preferred motifs in the HIV genome

The data were obtained from Dr. Linda Chelico (36). APOBEC3G-induced substitutions in the pol gene of HIV were measured in an isogenic HEK293T cell system expressing APOBEC3G (36). We reanalyzed the data and calculated substitutions based on the probability of the substitution at each position within the sequence context. First, we set up a filter for variant allele frequency of 0.01 (relevant alteration read count >100) to exclude background noise and call high-confidence substitutions at each position. Then, the counts for C>T substitutions in TC and CC motifs were calculated.
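The filtering and motif-counting steps can be sketched as follows in Python. This is a hedged illustration, not the analysis code: the input layout (5′ neighboring base, alt read count, and depth for each C>T position) is hypothetical, the thresholds are interpreted as VAF ≥ 0.01 and alt reads > 100, and the output counts positions rather than reads.

```python
def count_ct_by_motif(sites, min_vaf=0.01, min_alt_reads=100):
    """Count high-confidence C>T positions in TC and CC dinucleotide motifs.

    sites: iterable of (five_prime_base, alt_reads, depth) tuples, one per
    cytosine position with C>T reads (hypothetical layout).
    """
    counts = {"TC": 0, "CC": 0}
    for five_prime_base, alt_reads, depth in sites:
        if depth <= 0:
            continue
        # Assumed interpretation of the filter: VAF >= 0.01 and alt reads > 100.
        if alt_reads / depth < min_vaf or alt_reads <= min_alt_reads:
            continue
        motif = five_prime_base.upper() + "C"
        if motif in counts:
            counts[motif] += 1
    return counts
```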

Statistical tests

One-tailed Mann–Whitney U test for continuous variables, one-tailed 2 × 2 table Chi-square test or Fisher exact test for categorical variables, nonparametric Spearman correlation, and log-rank test for survival analysis were performed using GraphPad Prism version 8.4.3 statistical analysis software. P < 0.05 was considered statistically significant.

Ethical approval

All animal experiments were carried out following Weill Cornell Medical College Institutional Animal Care and Use Committee guidelines (IACUC Protocol 2017-0048).

Data availability statement

The TCGA pan-cancer human cancer data are available at cBioPortal and dbGaP under the accession number PHS000178. The whole-exome sequencing data from murine bladder cancers are available in the SRA database (BioProject ID: PRJNA674775).

Code availability statement

The open-source code used in this paper was as follows: BWA MEM (https://github.com/lh3/bwa), GATK (https://github.com/broadinstitute/gatk), Picard (https://github.com/broadinstitute/picard), Strelka2 (https://github.com/Illumina/strelka), MuSE (https://github.com/danielfan/MuSE), MuTect2 (https://github.com/broadinstitute/gatk), Pindel (https://github.com/genome/pindel), RADIA (https://github.com/aradenbaugh/radia), SomaticSniper (https://github.com/genome/somatic-sniper), VarScan (https://github.com/dkoboldt/varscan), VEP (https://github.com/Ensembl/ensembl-vep), vcf2maf (https://github.com/mskcc/vcf2maf), SeqKat (https://github.com/cran/SeqKat), CNVkit (https://github.com/etal/cnvkit), ComplexHeatmap package (https://github.com/jokergoo/ComplexHeatmap), RCircos package (https://github.com/cran/RCircos), Pyclone-vi (https://github.com/Roth-Lab/pyclone-vi), EXPANDS (https://github.com/noemiandor/expands), TitanCNA (https://github.com/gavinha/TitanCNA), ggtree (https://github.com/YuLab-SMU/ggtree), Gardner-Altman plots (https://github.com/ACCLAB/dabestr), MutationalPatterns (https://github.com/UMCUGenetics/MutationalPatterns), mutationProfiles (https://github.com/nriddiford/mutationProfiles), SigneR (https://github.com/rvalieris/signeR), HDP (https://github.com/nicolaroberts/hdp), deconstructSigs (https://github.com/raerose01/deconstructSigs), sigLASSO (https://github.com/gersteinlab/siglasso), factoextra (https://github.com/kassambara/factoextra), vegan (https://github.com/vegandevs/vegan), and Corrplot (https://github.com/taiyun/corrplot). The adapted code for the APOBEC3G signature is available at https://github.com/APOBEC3G.

APOBEC3G in murine bladder cancer

In contrast to humans, mice have a single Apobec3 (mA3) gene (6). Therefore, knocking out mouse Apobec3 provides a null background to study the effects of each of the seven individual transgenic human APOBEC3 proteins. We used mice that constitutively express a human APOBEC3G (hA3G) transgene in the Apobec3-null background (hA3G+mA3−/−) to dissect the mutagenic role of APOBEC3G in vivo (Fig. 1A; Materials and Methods; refs. 14, 15). In the hA3G+mA3−/− mice, transgenic APOBEC3G is expressed in multiple tissues, including the urinary bladder, the lung, the spleen, and immune cells (14, 15). The APOBEC3G mRNA level in the urinary bladder was comparable with that in the lung and spleen (Fig. 1B). The protein expression of APOBEC3G in the urinary bladder was confirmed by SDS-PAGE Western blotting (Fig. 1C). We reasoned that APOBEC3-induced mutagenesis alone would not be sufficient for tumor initiation (7), so we used the chemical carcinogen N-butyl-N-(4-hydroxybutyl)nitrosamine (BBN) to initiate tumorigenesis. We then compared the survival outcomes between the hA3G+mA3−/− mice and their mA3−/− littermate controls (Fig. 1A). In the 44 male mice enrolled in the experiment, blinded histopathologic examination showed that 19 of 25 (76%) hA3G+mA3−/− mice and 10 of 19 (52.6%) mA3−/− mice developed bladder cancer (Fig. 1D and E). Strikingly, the hA3G+mA3−/− mice had significantly shorter survival compared with the mA3−/− mice (log-rank P = 0.007; Fig. 1F; Supplementary Table S2). We also included a cohort of the C57BL/6J mice that were exposed to BBN as controls. The C57BL/6J mice, which harbor mouse Apobec3, developed a higher percentage of advanced bladder cancers and had significantly shorter survival than the mA3−/− mice (log-rank P = 0.0008) but phenocopied the hA3G+mA3−/− mice (Supplementary Fig. S1). All mice that died had bladder cancer.

Figure 1.

APOBEC3G contributes to carcinogenesis in a murine bladder cancer model. A, Experimental schema. The hA3G+mA3−/− mouse harbors a myc-tagged human APOBEC3G transgene under the control of a chicken β-actin promoter (CAG) on a mA3−/− background. The mA3−/− mouse was generated by knocking lacZ into the mouse Apobec3 locus between exon 4 and exon 5, resulting in the knockout of the mouse Apobec3 gene. B, APOBEC3G mRNA is expressed in different organs (lung, bladder, and spleen) from the hA3G+mA3−/− mice but not in those from the mA3−/− mice. Dots represent replicates from three different mice for each genotype. Horizontal lines indicate the mean value of mRNA expression. Error bars, mean ± SD. C, SDS-PAGE Western blot. APOBEC3G protein is expressed in the bladders of the hA3G+mA3−/− mice but not in the mA3−/− mice. Exogenous expression of myc-tagged APOBEC3G in HEK293T cells was used as a control. D, Representative H&E images (40× and 200×) of bladder tumors in the hA3G+mA3−/− and the mA3−/− mice. Scale bars, 200 μm in the 40× and 50 μm in 200× images, respectively. E, Composite bar chart representing the percentage of tumor stages in the hA3G+mA3−/− and the mA3−/− mice. The nontumor category includes benign tissue, hyperplasia, and dysplasia. F, The hA3G+mA3−/− (red) mice had lower survival than the mA3−/− (blue) mice. Log-rank test. All dead mice had pathologically confirmed bladder cancer. hA3G, human APOBEC3G; mA3, mouse Apobec3.

APOBEC3G drives genomic instability and clonal heterogeneity

To define the impact of human APOBEC3G on cancer mutagenesis, we performed deep whole-exome sequencing of somatic DNA from bladder cancers (mean, 348×) and matched germline DNA from the hA3G+mA3−/− and the mA3−/− mice. The hA3G+mA3−/− tumors harbored a significantly higher mutational burden (median 95.7 mutations per Mb) than the mA3−/− tumors (median 48.5 mutations per Mb, P = 0.04; Fig. 2A; Supplementary Table S3). Because nuclear access is a prerequisite for APOBEC3G-induced mutagenesis, we examined the nuclear localization of APOBEC3G in a bladder cancer organoid established from a hA3G+mA3−/− tumor. SDS-PAGE Western blotting showed that APOBEC3G was consistently present in the nuclear fraction (Supplementary Fig. S2A). The presence of APOBEC3G in the nuclear fraction was also confirmed in two human urothelial bladder cancer cell lines (5637 and RT112) using inducible GFP-tagged APOBEC3G (Supplementary Fig. S2B and S2C). Given that ectopic expression may cause supraphysiologic effects, we also examined the nuclear localization of endogenous APOBEC3G in the human bladder cancer cell lines RT112 and 5637 and the normal human bladder urothelial cell line HBLAK. APOBEC3G was present in the nuclear fraction of two cancer cell lines and was detected in small amounts in the nucleus of the normal urothelial cell line (Supplementary Fig. S2D). In addition, to avoid obscuring the nuclear signal by the surrounding cytoplasmic APOBEC3G, we performed fluorescent imaging of GFP-tagged APOBEC3G within extracted RT112 nuclei (Materials and Methods; Supplementary Fig. S3A). This confirmed the nuclear presence of APOBEC3G (Supplementary Fig. S3B). Finally, we performed fluorescent imaging and three-dimensional reconstruction of the extracted nuclei using confocal laser scanning microscopy (Supplementary Fig. S3C and S3D; Supplementary Video S1), demonstrating the presence of APOBEC3G in the nuclear compartment.

Figure 2.

APOBEC3G increases genomic instability and intratumoral heterogeneity. A, The hA3G+mA3−/− tumors harbor a higher mutational burden compared with the mA3−/− tumors. Mann–Whitney U test. *, P < 0.05. The data are shown in the box plot as the median with interquartile range (IQR). The lower whisker indicates Q1–1.5*IQR. The upper whisker indicates Q3+1.5*IQR. Each dot represents one tumor. B, Rainfall plot of kataegic loci in the hA3G+mA3−/− and the mA3−/− tumors. Vertical lines, individual tumors. Gray dots, singlet substitutions. Orange dots, substitutions within the APOBEC-unrelated kataegic loci. Purple dots, substitutions within the APOBEC-related kataegic loci with a significant number of C>T or G>A in kataegic loci calculated by the binomial test (Materials and Methods). C, Bar chart representing the number of kataegic loci, indicating that APOBEC-related kataegic loci were enriched in the hA3G+mA3−/− tumors. Twenty of 73 in the hA3G+mA3−/− tumors and 4 of 37 in the mA3−/− tumors. Fisher exact test. *, P < 0.05. D, Strand asymmetry of C>T substitutions in APOBEC-related kataegis. Each line represents APOBEC-related kataegic loci in different genotypes. The length of the line indicates the relative distance between substitutions. The triangles with different directions and colors indicate the strandedness of C>T substitutions. E, Bar chart representing the number of C>T substitutions occurring in cis and occurring in trans and singlets within all kataegic loci, indicating C>T substitutions occurring in cis were enriched in the hA3G+mA3−/− tumors. Forty-five of 81 in the hA3G+mA3−/− tumors and 9 of 40 in the mA3−/− tumors. Fisher exact test. *, P < 0.05. F, The hA3G+mA3−/− tumors had a broader mean length of CNV loss events compared with the mA3−/− tumors. Mann–Whitney U test. *, P < 0.05. The data are shown in the box plot as the median with interquartile range. The lower whisker indicates Q1–1.5*IQR. The upper whisker indicates Q3+1.5*IQR. 
Each dot represents one tumor. G, The hA3G+mA3−/− tumors harbor a higher number of clones compared with the mA3−/− tumors. Mann–Whitney U test. *, P < 0.05. Boxplots show the median and interquartile range. The lower whisker indicates Q1–1.5*IQR. The upper whisker indicates Q3+1.5*IQR. Individual dots indicate individual tumors. H, The hA3G+mA3−/− tumors displayed higher Shannon entropy compared with the mA3−/− tumors. Mann–Whitney U test. *, P < 0.05. Boxplots show the median and interquartile range. The lower whisker indicates Q1–1.5*IQR. The upper whisker indicates Q3+1.5*IQR. Individual dots indicate individual tumors. I, The median differences in Hill number orders between the hA3G+mA3−/− and the mA3−/− tumors are shown by the Gardner–Altman estimation plot. The mean difference is plotted as a bootstrap sampling distribution and is depicted as a dot with a 90% confidence interval, indicated by the ends of the vertical error bar. J, Clonal diversity bubble plot. Each circle represents a single subpopulation. The color scale indicates the size of the subpopulation in each tumor. K, Phylogenetic trees from one hA3G+mA3−/− and one mA3−/− tumor with comparable total mutational burdens but divergent clonal evolution patterns. WT, inferred baseline. Branch length corresponds to the proportion of the number of shared variants. The length of the branches in different tumors is normalized to the same scale. hA3G, human APOBEC3G; mA3, mouse Apobec3.

We then investigated the presence of kataegis, a clustered mutagenesis process attributed to the processive deamination activity of APOBEC3 proteins (4, 37, 38), including APOBEC3G (4). We hypothesized that human APOBEC3G deaminates proximate cytosines to generate kataegic loci in mouse tumors. We identified these kataegic loci based on the intersubstitution distance and the number of substitutions involved in each locus and defined the APOBEC-related kataegic loci based on the strand coordination (Materials and Methods). We also used the computational tool SeqKat (31) to identify the kataegic loci (Supplementary Fig. S4). The hA3G+mA3−/− tumors harbored more APOBEC-related kataegic events and C>T mutations occurring in cis than the mA3−/− tumors (Fig. 2B–E). Collectively, these results indicate that APOBEC3G increases kataegis in vivo. We also examined APOBEC3-induced copy-number changes using CNVkit (Materials and Methods; ref. 21). At the copy-number level, we found that the mean length of deleted segments was longer in the hA3G+mA3−/− tumors than in the mA3−/− tumors (log2 copy ratio < −0.2, P = 0.007; Fig. 2F; Supplementary Figs. S5 and S6).
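The distance-based step of kataegis calling described above can be sketched as follows. This is a minimal illustration and not the study's pipeline: the thresholds (at least six substitutions, each within 1,000 bp of the previous one) are assumptions chosen for the example, as are the function and parameter names. The strand-coordination test used to label a locus as APOBEC-related is not reproduced here.

```python
from collections import defaultdict


def find_clusters(substitutions, max_gap=1000, min_count=6):
    """Group substitutions into candidate kataegic loci.

    substitutions: iterable of (chromosome, position) tuples.
    max_gap, min_count: illustrative thresholds (assumptions, not the
    values used in the study).
    Returns a list of (chromosome, start, end, n_substitutions).
    """
    by_chrom = defaultdict(list)
    for chrom, pos in substitutions:
        by_chrom[chrom].append(pos)

    loci = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        run = [positions[0]]
        for pos in positions[1:]:
            if pos - run[-1] <= max_gap:
                # Still within the allowed intersubstitution distance.
                run.append(pos)
            else:
                # Gap too large: close the current run and start a new one.
                if len(run) >= min_count:
                    loci.append((chrom, run[0], run[-1], len(run)))
                run = [pos]
        if len(run) >= min_count:
            loci.append((chrom, run[0], run[-1], len(run)))
    return loci


# Hypothetical input: six clustered substitutions plus isolated ones.
example = [("chr1", p) for p in (100, 300, 600, 900, 1200, 1500)]
example += [("chr1", 50000), ("chr2", 10), ("chr2", 20)]
clusters = find_clusters(example)
```

Each returned locus could then be passed to a separate test of C>T or G>A enrichment to decide whether it is APOBEC-related.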

As mutational and structural genomic instability is a driver of intratumoral heterogeneity and clonal evolution (39), we hypothesized that human APOBEC3G-induced mutations and CNV engender intratumoral heterogeneity in our murine bladder cancer model. To test this hypothesis, we quantified the number and size of cancer clones in each tumor using two computational methods, EXPANDS (25, 26) and Pyclone-vi (Materials and Methods; ref. 24). The hA3G+mA3−/− tumors harbored a significantly higher clone number compared with the mA3−/− tumors (median 5 vs. 3 clones, respectively, P = 0.001) using Pyclone-vi (Fig. 2G). We then used Shannon entropy to represent the phylogenetic diversity, where a high entropy indicates high diversity (40). Shannon entropy was significantly higher in the hA3G+mA3−/− tumors compared with the mA3−/− tumors (median 1.3 vs. 0.8, respectively, P = 0.001), which indicated higher diversity in the hA3G+mA3−/− tumors (Fig. 2H). Next, we compared the diversity profiles based on Hill numbers (40), which include a sensitivity parameter that controls the weighting of common and rare clones in each tumor to ensure that rare clones do not skew the comparisons (Fig. 2I). These analyses confirmed that the hA3G+mA3−/− tumors harbor significantly higher clonal diversity compared with the mA3−/− tumors (Fig. 2H and I). These results were also confirmed using the EXPANDS computational tool (Fig. 2J and K; Supplementary Fig. S7; refs. 25, 26). In summary, our findings suggest that human APOBEC3G drives intratumoral clonal heterogeneity.
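The two diversity measures used above follow standard definitions and can be sketched in a few lines. The example assumes that each tumor is summarized as a list of clone proportions summing to 1 and that entropy is computed with the natural logarithm. Both are assumptions about the input and the log base rather than details confirmed by the text, and the function names are illustrative.

```python
import math


def shannon_entropy(proportions):
    """Shannon entropy (natural log) of clone proportions that sum to 1."""
    return -sum(p * math.log(p) for p in proportions if p > 0)


def hill_number(proportions, q):
    """Hill number (effective number of clones) of order q.

    q is the sensitivity parameter: q = 0 counts every clone equally,
    while larger q gives progressively more weight to common clones.
    """
    props = [p for p in proportions if p > 0]
    if q == 1:
        # The limit as q -> 1 is the exponential of the Shannon entropy.
        return math.exp(shannon_entropy(props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


# Hypothetical tumors: one with evenly sized clones, one dominated by a single clone.
even_tumor = [0.25, 0.25, 0.25, 0.25]
skewed_tumor = [0.85, 0.05, 0.05, 0.05]
profile_even = [hill_number(even_tumor, q) for q in (0, 1, 2)]
profile_skewed = [hill_number(skewed_tumor, q) for q in (0, 1, 2)]
```

Both hypothetical tumors contain four clones, so they agree at q = 0, but the skewed tumor's effective clone number falls as q increases, which is how the diversity profile prevents rare clones from dominating the comparison.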

As the C57BL/6J mice with mouse Apobec3 phenocopied the hA3G+mA3−/− mice, we analyzed the mutational burden, kataegic events, and measures of intratumoral heterogeneity. The mutational burden, clonal number, and Shannon entropy of tumors from the C57BL/6J mice with mouse Apobec3 were significantly higher than those from the mA3−/− mice and comparable with those from the hA3G+mA3−/− mice (Supplementary Fig. S8). However, the number of APOBEC-related kataegic events and the number of C>T substitutions occurring in cis were not significantly different between tumors from the C57BL/6J and the mA3−/− mice (Supplementary Fig. S8B and S8C).

The mutational signature induced by APOBEC3G

The hA3G+mA3−/− tumors harbored a significantly higher number of C>T substitutions in the CC motif compared with the mA3−/− tumors (P = 0.002; Fig. 3A). This is consistent with the known CC motif preference of APOBEC3G-induced mutations in viral genomes (12, 41–43). To further characterize the APOBEC3G-induced mutational signature in vivo, we deconstructed the trinucleotide mutational spectra for mutations in each tumor (Supplementary Table S4). C>T substitutions in the TCC, GCC, CCC, CCT, and GCG motifs were significantly higher in the hA3G+mA3−/− tumors than in the mA3−/− tumors. In contrast, C>T substitutions in the ACA motif significantly decreased (Supplementary Fig. S9A). In addition, APOBEC3G preferred the transcribed and lagging strands (Supplementary Fig. S9B). The preference for the lagging strand is similar to previous reports of mutagenesis by other APOBEC3 family members, including APOBEC3A and APOBEC3B (44, 45).
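The motif counts above depend on placing each substitution in its trinucleotide context. A minimal sketch of that classification is shown below, using the common convention of reporting every substitution from the pyrimidine strand. The reading of "CC motif" as a mutated cytosine with a cytosine on either side is an assumption inferred from the motifs listed (TCC, GCC, CCC, and CCT), and the function names are illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify(context, alt):
    """Return (trinucleotide, substitution) in pyrimidine-centred form.

    context: reference trinucleotide with the mutated base in the middle.
    alt: alternate base observed at the middle position.
    """
    ref = context[1]
    if ref in "AG":
        # Purine reference: report the event from the opposite strand.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
        ref = context[1]
    return context, f"{ref}>{alt}"


def is_cc_motif_c_to_t(context, alt):
    """True for a C>T change whose mutated C has a C neighbour on either side.

    This definition of the CC motif is an assumption made for illustration.
    """
    tri, sub = classify(context, alt)
    return sub == "C>T" and "C" in (tri[0], tri[2])


# Hypothetical calls: (reference trinucleotide, alternate base).
calls = [("TCC", "T"), ("GGA", "A"), ("ACA", "T"), ("CCT", "T")]
n_cc = sum(is_cc_motif_c_to_t(ctx, alt) for ctx, alt in calls)
```

Here the G>A call in the GGA context is reported as C>T in TCC after strand normalization, so it counts toward the CC motif, whereas the ACA call does not.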

Figure 3.

APOBEC3G generates a distinct in vivo mutational signature. A, The hA3G+mA3−/− tumors had higher C>T substitutions in the CC motif compared with the mA3−/− tumors. Mann–Whitney U test. *, P < 0.05. Boxplots show the median and interquartile range. The lower whisker indicates Q1–1.5*IQR. The upper whisker indicates Q3+1.5*IQR. Individual dots indicate individual tumors. B, PCA analysis based on the trinucleotide mutational spectrum showing divergence between the hA3G+mA3−/− and the mA3−/− tumors. The confidence ellipses represent the 95% confidence interval. Permutational multivariate analysis of variance test was used to calculate the P value. C, Significant mutational shift (black arrow) between the centroids of bootstrapped mutational spectra of the hA3G+mA3−/− and the mA3−/− tumors. The histogram represents the distribution of the difference between centroids of bootstrapped mutational spectra and original mutational spectra of the hA3G+mA3−/− (red) and the mA3−/− (blue) tumors. The dashed lines indicate the threshold of significance (P = 0.05) for each genotype. D, The single-base substitution (SBS) signature induced by transgenic expression of human APOBEC3G (SBS.A3G) and transgenic expression of human APOBEC3A (SBS.A3A). E, SBS.A3G has a low cosine similarity to SBS.A3A. F, Cosine similarity of experimentally derived and COSMIC Pan-Cancer Analysis of Whole Genomes (PCAWG) single-base substitution signatures. SBS.A3G has a low cosine similarity to SBS2 and SBS13. SBS.A3A had a high cosine similarity with SBS2. hA3G, human APOBEC3G; mA3, mouse Apobec3.

To characterize the differences in the mutational landscape between tumors from two genotypes, we performed a PCA of the 96 trinucleotide mutational spectra. The mutational spectra of the hA3G+mA3−/− tumors showed significant divergence from those of the mA3−/− tumors (P = 0.03; Fig. 3B). To confirm these results, we adapted a statistical framework (8, 9) that uses bootstrap resampling of the 96 trinucleotide mutational profiles of sequenced tumors to extract the net mutational signature generated by human APOBEC3G in vivo (Materials and Methods). We found a significant mutational spectrum shift between tumors from the hA3G+mA3−/− and the mA3−/− groups (P = 0.05; Fig. 3C). We extracted the net signature of APOBEC3G (SBS.A3G) using the MSD method (Materials and Methods). SBS.A3G was characterized by C>T substitutions in the TCC, CCC, and CCT motifs (Fig. 3D). This is consistent with the secondary motif preference of APOBEC3G for the TC context observed in the HIV genome (Supplementary Fig. S10; refs. 12, 36, 41). To confirm the SBS.A3G signature we identified, we used two additional orthogonal analytical methods for mutational signature extraction that rely on different mathematical procedures. The SigneR package (33) provides a full Bayesian treatment to the NMF model. The HDP package (32) utilizes the hierarchical Dirichlet process (Materials and Methods). Both methods identified a mutational signature that was predominantly composed of C>T mutations (SBS.A3G), was present in hA3G+mA3−/− tumors but absent from the mA3−/− tumors, and was distinct from other extracted signatures (Supplementary Fig. S11). The extracted signatures from three methods confirmed the key defining features of SBS.A3G, including the predominance of C>T mutations in the CCC, CCT, and TCC motifs (Supplementary Fig. S12A and S12B). 
To further confirm the capacity of APOBEC3G to deaminate cytidines in the CCC, CCT, and TCC motifs, we performed the cytidine deamination assay with purified APOBEC3G protein and oligonucleotides harboring these motifs. Our results confirmed that APOBEC3G has the capacity to directly deaminate cytidines in the three motifs we identified (Supplementary Fig. S13).
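The bootstrap comparison of mutational spectra described above can be illustrated with a simplified sketch. It is not the adapted framework itself: the use of cosine distance between centroids, the 95% quantile, the number of replicates, and all function names are assumptions made for the example. Tumors within a genotype are resampled with replacement, and the shift between the two genotype centroids is compared with the spread each genotype shows around its own centroid.

```python
import math
import random


def centroid(profiles):
    """Channel-wise mean of equal-length mutational profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def cosine_distance(a, b):
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def bootstrap_threshold(profiles, n_boot=1000, quantile=0.95, seed=0):
    """Distance from the observed centroid that bootstrap replicates rarely exceed."""
    rng = random.Random(seed)
    observed = centroid(profiles)
    distances = []
    for _ in range(n_boot):
        # Resample tumors with replacement, keeping the group size fixed.
        sample = [rng.choice(profiles) for _ in profiles]
        distances.append(cosine_distance(centroid(sample), observed))
    distances.sort()
    return distances[min(int(quantile * n_boot), n_boot - 1)]


def spectrum_shift(group_a, group_b, **kwargs):
    """Return the centroid shift and the bootstrap threshold of each group."""
    shift = cosine_distance(centroid(group_a), centroid(group_b))
    return shift, bootstrap_threshold(group_a, **kwargs), bootstrap_threshold(group_b, **kwargs)


# Hypothetical three-channel profiles standing in for 96-channel spectra.
transgenic = [[10, 0, 0], [9, 1, 0], [11, 0, 1]]
control = [[0, 10, 0], [1, 9, 0], [0, 11, 1]]
shift, thr_transgenic, thr_control = spectrum_shift(transgenic, control)
```

Under these assumptions, a shift larger than both thresholds would indicate that the difference between genotypes exceeds the resampling variability within either genotype.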

We then compared SBS.A3G to a previously described mutational signature induced by transgenic expression of human APOBEC3A in mice (SBS.A3A; Materials and Methods; ref. 7). SBS.A3G showed a low cosine similarity to SBS.A3A, which is characterized by predominant C>T substitutions in the TCA and TCT motifs (Fig. 3D and E). Furthermore, SBS.A3G showed a low cosine similarity to the mutational signature potentially induced by mouse Apobec3 in tumors from the C57BL/6J mice (SBS.mA3; Supplementary Fig. S14A and S14B). Finally, we characterized the relationship between the SBS.A3G signature and the previously curated COSMIC single-base substitution signatures derived from human cancers (Materials and Methods). We confirmed that SBS.A3G had a low cosine similarity to SBS2 and SBS.A3A, which are predominantly characterized by C>T substitutions in the TCW motifs (Fig. 3F; Supplementary Fig. S14C). We found no COSMIC signatures comparable to the mouse Apobec3 signature (SBS.mA3; Supplementary Fig. S14D). Together, these data suggest that APOBEC3G induces a mutational signature distinct from those of other APOBEC3 family members.
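Cosine similarity, the measure used for these signature comparisons, treats each signature as a vector of channel weights and ranges from 0 (no shared channels) to 1 (identical proportions) for non-negative vectors. A minimal sketch follows. The four-channel vectors are hypothetical stand-ins for 96-channel signatures and do not reproduce SBS.A3G or SBS.A3A.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same number of channels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("signatures must contain at least one non-zero channel")
    return dot / (norm_a * norm_b)


# Hypothetical channels, in order: C>T at TCA, TCT, CCC, TCC.
tcw_like = [0.5, 0.4, 0.05, 0.05]
cc_like = [0.05, 0.05, 0.5, 0.4]
similarity = cosine_similarity(tcw_like, cc_like)
```

Because the measure depends only on the relative weights across channels, two signatures with very different mutation counts but the same proportions score 1.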

To identify genes recurrently mutated by APOBEC3G, we evaluated putative APOBEC3G-induced mutations or other missense and nonsense mutations. Tumors from the hA3G+mA3−/− mice had a higher frequency of C>T substitutions in the CCC, CCT, and TCC motifs in known cancer genes (46), such as Kmt2d, Crebbp, and Kmt2a, compared with tumors from the mA3−/− mice (Supplementary Fig. S15). Mutations in other genes, including the Csmd3, Cubn, and Herc1 genes, which are frequently mutated in human bladder cancer patients, were also more common in the hA3G+mA3−/− mice (Supplementary Fig. S15).

APOBEC3G contributes to mutagenesis in human cancers

We asked whether the SBS.A3G signature we identified contributes to the mutational profiles of human cancers. We reasoned that APOBEC3G expression is a prerequisite for APOBEC3G-induced mutagenesis. We examined the mRNA expression of APOBEC3G and the resulting mutational signatures in 8,292 tumors from 21 cancer types from TCGA pan-cancer cohorts (47). APOBEC3G mRNA was ubiquitously expressed in all tumor types [mean RNA-seq by expectation-maximization (RSEM) value: 368.7, mean expression ratio normalized to the mRNA expression of the housekeeping gene TATA-Box binding protein (TBP): 1.46; Fig. 4A]. Diffuse large B-cell lymphoma (DLBC), urothelial bladder carcinoma (BLCA), and renal clear cell carcinoma (KIRC) had the top normalized APOBEC3G to TBP expression ratios (4.9, 3.3, 3.1, respectively). To exclude the possibility that high APOBEC3 expression in bulk tumor RNA-seq originated from the infiltrating immune cells, we examined the expression level of APOBEC3G in cancer cell lines from the Cancer Cell Line Encyclopedia (48). We identified significant APOBEC3G mRNA expression in bladder cancer cell lines (Supplementary Fig. S16).

Figure 4.

The mutational impact of APOBEC3G in human cancers. A, mRNA expression level of APOBEC3G (normalized to TBP) in different cancer types in TCGA pan-cancer cohorts. The scale bar indicates the APOBEC3G mRNA level normalized to TBP. B, The contribution of APOBEC3G-induced mutagenesis to human cancers. The area of each circle represents the proportion of tumors with SBS.A3G contribution for each cancer type. The circle's color represents the median number of mutations attributed to SBS.A3G in each cancer type. The proportion of tumors and the median contribution counts in each cohort were calculated after averaging the outcomes from fitting SBS.A3G extracted by different methods to each tumor using deconstructSigs. C, Patients in the SBS.A3G-predominant group (CCC, CCT, and TCC) had lower survival than patients in the TCW-predominant groups, irrespective of the total mutational burden in TCGA bladder cancer cohort. D, Overall survival analysis for the BLCA patients based on the C>T counts in respective motifs and the total mutational burden of the BLCA cohort. Log-rank test. A3G, APOBEC3G; TMB, tumor mutational burden. Abbreviations for TCGA cancer types are available at https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations.

To validate that SBS.A3G can be found in human tumors, we used deconstructSigs (DS; ref. 34) to estimate the number of mutations attributed to APOBEC3G in TCGA pan-cancer tumors (Materials and Methods). We found that in the urothelial BLCA cohort, 15% of tumors harbored evidence of SBS.A3G mutations (median: 12; range, 0–133; Fig. 4B). The sigLASSO (35) and MutationalPatterns (30) methods confirmed these results (Supplementary Fig. S17). We then measured the number of C>T substitutions in the CCC, CCT, and TCC peaks. The median C>T substitution counts in the CCC, CCT, and TCC motifs significantly correlated with the median APOBEC3G mRNA expression ratio (R = 0.49, P = 0.02; Supplementary Fig. S18). To understand the impact of APOBEC3G-induced mutations on clinical outcomes, we assigned urothelial bladder cancer patients in TCGA cohort to either the SBS.A3G-predominant (C>T in the CCC+CCT+TCC) or the TCW-predominant (C>T in the TCW, W: A or T) subgroups preferred by other APOBEC3 enzymes. Patients in the SBS.A3G-predominant subgroup had lower survival than patients in the TCW-predominant subgroup regardless of the total mutational burden (log-rank P = 0.0009; Fig. 4C and D). These data are consistent with the negative impact of APOBEC3G on survival we initially observed in our bladder cancer mouse model, suggesting that this phenotype extends to patients with bladder cancer.

Mutational signatures in human cancers have been associated with multiple mutagenic processes (1, 31, 49). However, human cancers are intrinsically noisy systems harboring a continuous interplay of several mutagenic and DNA-repair processes (10). Consequently, approaches relying on the analysis of sequencing data from human cancers, followed by heuristic attribution of mutational signatures to a particular mutagenic process, are inherently limited (8–10). These challenges have hindered our understanding of the specific contributions to mutagenesis of each of the seven APOBEC3 proteins, which are frequently coexpressed in human cancer cells (50). To dissect the mutagenic impact of a single member of the human APOBEC3 family in an experimentally controlled bladder cancer mouse model, we leveraged the differences between mouse and human APOBEC3 loci, which are separated by 76 million years of evolution (51). Here, we focused on APOBEC3G, a cytidine deaminase that restricts retroviruses by mutating viral genomes (12, 41–43). However, whether it can mutate genomic DNA in cancer cells has remained unknown until our study.

Using a model with transgenic expression of human APOBEC3G on a null mouse Apobec3 background, we provide cause-and-effect experimental evidence showing that human APOBEC3G directly contributes to mutating cancer genomes in vivo. Our data show that APOBEC3G induces genomic instability, resulting in an increase in tumor mutational burden, copy-number loss events, and clonal diversity. Clonal diversity is thought to arise from the acquisition of somatic mutations and chromosomal structural variants (39, 52), and high intratumoral heterogeneity was previously associated with poor cancer outcomes and the evolution of drug resistance (26, 53). We also discovered that APOBEC3G significantly increased kataegis in a pattern consistent with the known processive nature of its catalytic deamination effects (4, 37, 38). Kataegic events are enriched in genomic rearrangements (54, 55) and chromothripsis regions (38, 56) in several cancer types. These processes contribute to oncogene amplification or the inactivation of tumor-suppressing genes, which are important driver events in human cancers (56). Longitudinal computational analyses of the mutational profiles of serial tumors from individual patients show that mutational signatures attributed to APOBEC3 proteins (SBS2 and 13) are associated with subclone expansions in several cancer types, including urothelial bladder cancer (34, 57–59). Ongoing efforts to develop small-molecule inhibitors of APOBEC3 proteins to restrict cancer evolution require specific knowledge of the role of the individual APOBEC3 proteins responsible for driving intratumoral heterogeneity and clonal evolution (2).

The subcellular distribution of APOBEC3G is predominantly cytoplasmic (60). However, our data suggest that APOBEC3G is present in the nuclear compartment of bladder cancer cells. This is consistent with previous studies that found small amounts of APOBEC3G are present in the nuclei of human T lymphocyte cell lines (61, 62) and that APOBEC3G significantly contributes to an increase in DNA damage and genomic instability in multiple myeloma cell lines (63). Here, we adopted an experimental approach to establish the APOBEC3G-induced signature using three well-established orthogonal computational frameworks (8, 9, 32, 33). By comparing to the mA3−/− tumors, we were able to reduce background mutational signatures, including carcinogen-induced and other endogenous mutagen-induced signatures, as well as the signatures induced by other mouse Apobec/Aid family members. SBS.A3G had low cosine similarity to motifs attributed to other proteins in the APOBEC3 family. The differences in motif preferences and mutational patterns between APOBEC3G and APOBEC3A and APOBEC3B potentially explain the differences between our experimentally induced SBS.A3G and the COSMIC SBS2 and SBS13 signatures, which are classically associated with APOBEC3A and APOBEC3B (1, 3, 55).
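The cosine similarity used to compare signatures can be illustrated with a short Python sketch. It assumes each signature is represented as a vector of non-negative channel weights (for example, 96 trinucleotide channels); the function name and the toy four-channel vectors are hypothetical and do not reproduce SBS.A3G, SBS2, or SBS13.

```python
import math


def cosine_similarity(sig_a, sig_b):
    """Cosine similarity between two signature vectors of equal length."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same number of channels")
    dot = sum(a * b for a, b in zip(sig_a, sig_b))
    norm_a = math.sqrt(sum(a * a for a in sig_a))
    norm_b = math.sqrt(sum(b * b for b in sig_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("signature vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical four-channel signatures that load on different channels.
toy_sig_a = [0.7, 0.2, 0.1, 0.0]
toy_sig_b = [0.1, 0.1, 0.2, 0.6]
toy_similarity = cosine_similarity(toy_sig_a, toy_sig_b)
```

A value near 1 indicates that two signatures weight the same channels in similar proportions, whereas signatures dominated by different channels, as in the toy vectors, give a value closer to 0.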

The signature we identified (SBS.A3G) includes the C>T substitutions in the CC motif previously reported in the viral genome (12, 41) and substitutions in the TCC motif, consistent with the secondary preference of APOBEC3G for the TC motif previously identified in in vitro assays (64) and in the viral genome in cell lines (12, 41). Our findings suggest an overlap in the motifs preferred by APOBEC3G in the HIV and nuclear genomes. It is important to note that the mechanisms determining the net mutational pattern induced by APOBEC3G in cancer genomes are not entirely dependent on the enzyme's motif-binding preferences. The factors exclusive to the nuclear genome include the competition between APOBEC3G and single-strand DNA (ssDNA) binding proteins, which could sterically block APOBEC3G from scanning ssDNA for the preferred motifs (65, 66), and secondary ssDNA structures in the nucleus, which can alter the frequency and hotspots of APOBEC3 deamination (67, 68). These factors collectively determine the final signature imprinted by APOBEC3G on DNA in vivo. A recent study (69) showed residual APOBEC3-pattern mutagenesis even after APOBEC3A and APOBEC3B knockout in cancer cell lines. This is consistent with our data suggesting that APOBEC3G contributes to C>T mutations in the TC motifs in vivo.
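The motif contexts discussed above can be made explicit with a small Python sketch that assigns a C>T substitution to a motif class from its reference trinucleotide context. This is an illustrative assumption about how such a tally might be set up, not the method used in this study; the function name and the class labels ("CC", "TCW", "TCC", "TC-other", "other") are hypothetical. Substitutions reported on the purine strand (G>A) are converted to the pyrimidine strand first.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify_c_to_t(context, alt):
    """Classify a substitution by the motif around the mutated cytosine.

    context: three reference bases centered on the mutated base.
    alt: the mutant base on the same strand as context.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3:
        raise ValueError("context must be exactly three bases")
    # Report G>A events on the pyrimidine strand as C>T.
    if context[1] == "G":
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    if context[1] != "C" or alt != "T":
        return "not C>T"
    five_prime, three_prime = context[0], context[2]
    if five_prime == "C":
        return "CC"          # mutated C preceded by C
    if five_prime == "T":
        if three_prime in "AT":
            return "TCW"     # W = A or T
        if three_prime == "C":
            return "TCC"
        return "TC-other"
    return "other"


example_labels = [classify_c_to_t(ctx, alt)
                  for ctx, alt in [("CCA", "T"), ("TCA", "T"), ("TCC", "T"), ("GGA", "A")]]
```

Counting these labels across all C>T substitutions in a tumor gives a coarse view of how mutations distribute between the CC and TC contexts.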

In our examination of the TCGA pan-cancer data set, we observed significant levels of APOBEC3G expression across several cancer types. However, determining a definitive causal relationship between the mRNA expression of individual APOBEC3 genes and their respective signatures in human cancer genomes is challenging for several reasons. First, mutational signatures can be imprinted on the genomes gradually over long time intervals or in punctuated episodes. Therefore, the expression level of a given mutagenic protein at the time of sampling may not reflect its expression level at the time of mutagenesis in human cancers (70). Second, different APOBEC3 proteins may be expressed in the same cancer cell simultaneously, making it challenging to link specific mutational signatures to the expression of individual APOBEC3 proteins. Furthermore, using the respective motifs, we found that patients with bladder cancer in the group with higher APOBEC3G-induced mutational signature had shorter overall survival than those enriched in signatures attributed to APOBEC3A and APOBEC3B, consistent with a previous report (71). These data suggest that individual APOBEC3 proteins impact clinical outcomes differently. It is important to note that in non–muscle–invasive bladder cancers, non–small cell lung cancers, and breast cancers, APOBEC3 mutations are collectively associated with poor clinical outcomes (72–74).

The C57BL/6J mice, similar to the hA3G+mA3−/− mice, had shorter survival than the mA3−/− mice. Further analysis identified that mouse Apobec3 was associated with increased tumor mutational burden and intratumoral clonal heterogeneity, which may contribute to tumorigenesis and worse survival. Despite possessing two conserved zinc-coordinating domains required for cytidine deamination (75), mouse Apobec3 was presumed to lack the potential to mutate genomic DNA owing to its predominant presence in the cytoplasm (76) and low capacity for mutagenesis in normal cells (77). However, expressing mouse Apobec3 lacking exon 5, the major isoform expressed in the C57BL/6 mice (78), significantly increased the mutational burden in two breast cancer cell lines derived from different genetically engineered mouse models (79). These findings are consistent with our observations in the C57BL/6J mice that mouse Apobec3 can induce mutagenesis, especially in tumor cells. Despite the comparable mutagenic capability, mouse Apobec3 and APOBEC3G have alternative mechanisms for restricting retroviral replication (6, 14, 80, 81) and a reversed functional organization of the two cytidine deaminase domains (78, 82). These structural differences may account for the variation in the mutagenic characteristics between mouse and human APOBEC3, leading to distinct mutational signatures.

Our study has potential limitations, one of which is the need for an initial carcinogenic event in our mouse model. We chose to use BBN, an alkylating nitrosamine, to recapitulate human urothelial bladder cancer, owing to its similarity to cigarette smoke carcinogens implicated in human urothelial bladder cancer (83). Although SBS.A3G is not similar to the previously described BBN signature, we cannot completely rule out that the observed SBS.A3G results from an interaction between BBN and APOBEC3G. The use of only male mice, due to the known preferential effect of BBN on male mice, is a limitation of our study (84). The sex-specific effects of APOBEC3G-induced mutagenesis in bladder cancer need further investigation. Furthermore, the mutational signature of APOBEC3G identified in mice may not fully represent the corresponding mutational signature in humans. Similarly, C>G mutations (SBS13) induced by APOBEC3A were lost in the mouse genome but can be identified in other model systems (7). These differences may be attributable to a 15-fold increase in the efficiency of genomic uracil processing in human cells compared with mouse cells (85), which could affect the Rev1-mediated transversions of the deaminated cytidines (69, 86).

The interaction between APOBEC3 enzymes and the immune system warrants investigation in future studies. The mA3−/− mice display no significant differences in the abundance of lymphoid subsets (15, 87). The higher tumor mutational burden in the hA3G+mA3−/− tumors potentially translates into higher neoantigen loads that provoke an anticancer immune response and increased sensitivity to immune-checkpoint blockade (88). Furthermore, APOBEC3G increased the mean length of deleted segments, indicating that APOBEC3G is potentially engaged in generating structural variants. Recently, chromosomal instability signatures were described in human cancers based on the lengths and segmentation of copy-number events (89). The impact of APOBEC3G on these chromosomal instability signatures warrants further investigation. Finally, the potential deamination-independent contributions of APOBEC3G to genomic instability require further investigation, given recent reports showing that other APOBEC3 enzymes can drive chromosomal instability in a deaminase-independent manner (90).

In summary, we show that APOBEC3G contributes to mutagenesis, kataegis, and intratumoral heterogeneity in cancer genomes. Our findings will potentially inform future therapeutic efforts to restrict tumor evolution by targeting specific APOBEC3 enzymes (2).

O. Elemento reports other support from Owkin, OneThree Bio, personal fees and other support from Volastra Therapeutics, Pionyr Immunotherapeutics, and Champions Oncology during the conduct of the study. B.M. Faltas reports grants from the Starr Cancer Consortium (I12-0030) during the conduct of the study; personal fees from Astrin Biosciences, Natera, Guardant, Janssen, Merck, Immunomedics, Immunomedics/Gilead, QED Therapeutics, BostonGene, Urotoday, and Axiom Healthcare Strategies; and grants from Eli Lilly outside the submitted work. No disclosures were reported by the other authors.

W. Liu: Conceptualization, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. K.P. Newhall: Data curation, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. F. Khani: Data curation, formal analysis, validation, visualization, methodology, writing–original draft, writing–review and editing. L. Barlow: Resources, data curation, validation, writing–review and editing. D. Nguyen: Data curation, formal analysis, validation, visualization, methodology, writing–original draft, writing–review and editing. L. Gu: Data curation, formal analysis, investigation, visualization, methodology, writing–original draft, writing–review and editing. K. Eng: Data curation, software, formal analysis, validation, methodology, writing–original draft, writing–review and editing. B. Bhinder: Data curation, software, formal analysis, validation, visualization, methodology, writing–original draft, writing–review and editing. M. Uppal: Data curation, software, formal analysis, validation, methodology, writing–original draft, writing–review and editing. C. Récapet: Data curation, software, formal analysis, methodology, writing–review and editing. A. Sboner: Resources, software, formal analysis, methodology, writing–review and editing. S.R. Ross: Resources, writing–original draft, writing–review and editing. O. Elemento: Resources, software, formal analysis, writing–original draft, writing–review and editing. L. Chelico: Resources, data curation, formal analysis, validation, writing–original draft, writing–review and editing. B.M. Faltas: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing.

B.M. Faltas was supported by the Starr Cancer Consortium grant (I12-0030). The authors thank Dr. Jenny Xiang (Genomics Resources Core Facility) for whole-exome sequencing and Dr. Tuo Zhang for his input on bioinformatic analysis. They thank Dr. Edwin Sandanaraj for his input on bioinformatic analysis. The authors thank Bing He from the Translational Pathology Core lab at Weill Cornell Medicine. They thank Dr. John Maciejowski and Alexandra Dananberg for the kind gift of the doxycycline-inducible GFP-tagged APOBEC3G vector and PiggyBac transposase vector and their input and advice on the immunofluorescence microscopy (DeltaVision Elite system). The authors thank Dr. Zhengming Chen for his input on the statistical analyses. They thank Dr. Sushmita Mukherjee for her input and advice on confocal laser scanning microscopy, 3D reconstruction by Imaris, and the quantification of immunofluorescent images. The authors thank the Microscopy and Image Analysis Core Facility. Finally, they thank Dr. Nathaniel Landau for his input and advice. The following reagents were obtained through the NIH HIV Reagent Program, Division of AIDS, NIAID, NIH: polyclonal anti-human APOBEC3G (ApoC17) antiserum, rabbit, ARP-10082, contributed by Dr. Klaus Strebel. Human APOBEC3G protein with C-terminal histidine tag, recombinant from Baculovirus, ARP-10068, contributed by DAIDS/NIAID, produced by ImmunoDX, LLC.

The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

1. PCAWG Mutational Signatures Working Group, PCAWG Consortium, Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, et al. The repertoire of mutational signatures in human cancer. Nature 2020;578:94–101.
2. Swanton C, McGranahan N, Starrett GJ, Harris RS. APOBEC enzymes: mutagenic fuel for cancer evolution and heterogeneity. Cancer Discov 2015;5:704–12.
3. Roberts SA, Lawrence MS, Klimczak LJ, Grimm SA, Fargo D, Stojanov P, et al. An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers. Nat Genet 2013;45:970–6.
4. Taylor BJ, Nik-Zainal S, Wu YL, Stebbings LA, Raine K, Campbell PJ, et al. DNA deaminases induce break-associated mutation showers with implication of APOBEC3B and 3A in breast cancer kataegis. eLife 2013;2:e00534.
5. Burns MB, Lackey L, Carpenter MA, Rathore A, Land AM, Leonard B, et al. APOBEC3B is an enzymatic source of mutation in breast cancer. Nature 2013;494:366–70.
6. MacMillan AL, Kohli RM, Ross SR. APOBEC3 inhibition of mouse mammary tumor virus infection: the role of cytidine deamination versus inhibition of reverse transcription. J Virol 2013;87:4808–17.
7. Law EK, Levin-Klein R, Jarvis MC, Kim H, Argyris PP, Carpenter MA, et al. APOBEC3A catalyzes mutation and drives carcinogenesis in vivo. J Exp Med 2020;217:e20200261.
8. Kucab JE, Zou X, Morganella S, Joel M, Nanda AS, Nagy E, et al. A compendium of mutational signatures of environmental agents. Cell 2019;177:821–36.
9. Zou X, Owusu M, Harris R, Jackson SP, Loizou JI, Nik-Zainal S, et al. Validating the concept of mutational signatures with isogenic cell models. Nat Commun 2018;9:1744.
10. Koh G, Zou X, Nik-Zainal S. Mutational signatures: experimental design and analytical framework. Genome Biol 2020;21:37.
11. Ng JCF, Quist J, Grigoriadis A, Malim MH, Fraternali F. Pan-cancer transcriptomic analysis dissects immune and proliferative functions of APOBEC3 cytidine deaminases. Nucleic Acids Res 2019;47:1178–94.
12. Harris RS, Bishop KN, Sheehy AM, Craig HM, Petersen-Mahrt SK, Watt IN, et al. DNA deamination mediates innate immunity to retroviral infection. Cell 2003;113:803–9.
13. Sheehy AM, Gaddis NC, Choi JD, Malim MH. Isolation of a human gene that inhibits HIV-1 infection and is suppressed by the viral Vif protein. Nature 2002;418:646–50.
14. Stavrou S, Crawford D, Blouch K, Browne EP, Kohli RM, Ross SR, et al. Different modes of retrovirus restriction by human APOBEC3A and APOBEC3G in vivo. PLoS Pathog 2014;10:e1004145.
15. Okeoma CM, Lovsin N, Peterlin BM, Ross SR. APOBEC3 inhibits mouse mammary tumour virus replication in vivo. Nature 2007;445:927–30.
16. Opi S, Kao S, Goila-Gaur R, Khan MA, Miyagi E, Takeuchi H, et al. Human immunodeficiency virus type 1 Vif inhibits packaging and antiviral activity of a degradation-resistant APOBEC3G variant. J Virol 2007;81:8236–46.
17. Miyagi E, Kao S, Fumitaka M, Buckler-White A, Plishka R, Strebel K, et al. Long-term passage of Vif-null HIV-1 in CD4+ T cells expressing sub-lethal levels of APOBEC proteins fails to develop APOBEC resistance. Virology 2017;504:1–11.
18. Neely AE, Bao X. Nuclei isolation staining (NIS) method for imaging chromatin-associated proteins in difficult cell types. Curr Protoc Cell Biol 2019;84:e94.
19. Pauli C, Hopkins BD, Prandi D, Shaw R, Fedrizzi T, Sboner A, et al. Personalized in vitro and in vivo cancer models to guide precision medicine. Cancer Discov 2017;7:462–77.
20. Ellrott K, Bailey MH, Saksena G, Covington KR, Kandoth C, Stewart C, et al. Scalable open science approach for mutation calling of tumor exomes using multiple genomic pipelines. Cell Syst 2018;6:271–81.
21. Talevich E, Shain AH, Botton T, Bastian BC. CNVkit: genome-wide copy number detection and visualization from targeted DNA sequencing. PLoS Comput Biol 2016;12:e1004873.
22. Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 2016;32:2847–9.
23. Zhang H, Meltzer P, Davis S. RCircos: an R package for Circos 2D track plots. BMC Bioinf 2013;14:244.
24. Gillis S, Roth A. PyClone-VI: scalable inference of clonal population structures using whole genome data. BMC Bioinf 2020;21:571.
25. Andor N, Harness JV, Müller S, Mewes HW, Petritsch C. EXPANDS: expanding ploidy and allele frequency on nested subpopulations. Bioinformatics 2014;30:50–60.
26. Andor N, Graham TA, Jansen M, Xia LC, Aktipis CA, Petritsch C, et al. Pan-cancer analysis of the extent and consequences of intratumor heterogeneity. Nat Med 2016;22:105–13.
27. Ha G, Roth A, Khattra J, Ho J, Yap D, Prentice LM, et al. TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data. Genome Res 2014;24:1881–93.
28. Ho J, Tumkaya T, Aryal S, Choi H, Claridge-Chang A. Moving beyond P values: data analysis with estimation graphics. Nat Methods 2019;16:565–6.
29. Yu G. Using ggtree to visualize data on tree-like structures. Curr Protoc Bioinformatics 2020;69:e96.
30. Blokzijl F, Janssen R, van Boxtel R, Cuppen E. MutationalPatterns: comprehensive genome-wide analysis of mutational processes. Genome Med 2018;10:33.
31. Yousif F, Prokopec SD, Sun RX, Fan F, Lalansingh CM, Park DH, et al. The origins and consequences of localized and global somatic hypermutation [Internet]. bioRxiv; 2018. Available from: http://biorxiv.org/lookup/doi/10.1101/287839.
32. Riva L, Pandiri AR, Li YR, Droop A, Hewinson J, Quail MA, et al. The mutational signature profile of known and suspected human carcinogens in mice. Nat Genet 2020;52:1189–97.
33. Rosales RA, Drummond RD, Valieris R, Dias-Neto E, da Silva IT. signeR: an empirical Bayesian approach to mutational signature discovery. Bioinformatics 2017;33:8–16.
34. Rosenthal R, McGranahan N, Herrero J, Taylor BS, Swanton C. deconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol 2016;17:31.
35. Li S, Crawford FW, Gerstein MB. Using sigLASSO to optimize cancer mutation signatures jointly with sampling likelihood. Nat Commun 2020;11:3575.
36. Mohammadzadeh N, Love RP, Gibson R, Arts EJ, Poon AFY, Chelico L, et al. Role of co-expressed APOBEC3F and APOBEC3G in inducing HIV-1 drug resistance. Heliyon 2019;5:e01498.
37. D'Antonio M, Tamayo P, Mesirov JP, Frazer KA. Kataegis expression signature in breast cancer is associated with late onset, better prognosis, and higher HER2 levels. Cell Rep 2016;16:672–83.
38. Maciejowski J, Chatzipli A, Dananberg A, Chu K, Toufektchan E, Klimczak LJ, et al. APOBEC3-dependent kataegis and TREX1-driven chromothripsis during telomere crisis. Nat Genet 2020;52:884–90.
39. Gupta RG, Somer RA. Intratumor heterogeneity: novel approaches for resolving genomic architecture and clonal evolution. Mol Cancer Res 2017;15:1127–37.
40. Chao A, Chiu C-H, Jost L. Phylogenetic diversity measures based on Hill numbers. Phil Trans R Soc B 2010;365:3599–609.
41. Yu Q, König R, Pillai S, Chiles K, Kearney M, Palmer S, et al. Single-strand specificity of APOBEC3G accounts for minus-strand deamination of the HIV genome. Nat Struct Mol Biol 2004;11:435–42.
42. Bishop KN, Holmes RK, Sheehy AM, Davidson NO, Cho S-J, Malim MH, et al. Cytidine deamination of retroviral DNA by diverse APOBEC proteins. Curr Biol 2004;14:1392–6.
43. Maiti A, Myint W, Kanai T, Delviks-Frankenberry K, Sierra Rodriguez C, Pathak VK, et al. Crystal structure of the catalytic domain of HIV-1 restriction factor APOBEC3G in complex with ssDNA. Nat Commun 2018;9:2460.
44. Seplyarskiy VB, Soldatov RA, Popadin KY, Antonarakis SE, Bazykin GA, Nikolaev SI, et al. APOBEC-induced mutations in human cancers are strongly enriched on the lagging DNA strand during replication. Genome Res 2016;26:174–82.
45. Hoopes JI, Cortez LM, Mertz TM, Malc EP, Mieczkowski PA, Roberts SA, et al. APOBEC3A and APOBEC3B preferentially deaminate the lagging strand template during DNA replication. Cell Rep 2016;14:1273–82.
46. Chakravarty D, Gao J, Phillips S, Kundra R, Zhang H, Wang J, et al. OncoKB: a precision oncology knowledge base. JCO Precis Oncol 2017;1–16.
47. Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal 2013;6:pl1.
48. Ghandi M, Huang FW, Jané-Valbuena J, Kryukov GV, Lo CC, McDonald ER, et al. Next-generation characterization of the cancer cell line encyclopedia. Nature 2019;569:503–8.
49. Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ, Stratton MR. Deciphering signatures of mutational processes operative in human cancer. Cell Rep 2013;3:246–59.
50. Refsland EW, Stenglein MD, Shindo K, Albin JS, Brown WL, Harris RS, et al. Quantitative profiling of the full APOBEC3 mRNA repertoire in lymphocytes and tissues: implications for HIV-1 restriction. Nucleic Acids Res 2010;38:4274–84.
51. Salter JD, Bennett RP, Smith HC. The APOBEC protein family: united by structure, divergent in function. Trends Biochem Sci 2016;41:578–94.
52. Burrell RA, McGranahan N, Bartek J, Swanton C. The causes and consequences of genetic heterogeneity in cancer evolution. Nature 2013;501:338–45.
53. Vlachostergios PJ, Faltas BM. Treatment resistance in urothelial carcinoma: an evolutionary perspective. Nat Rev Clin Oncol 2018;15:495–509.
54. Davis CF, Ricketts CJ, Wang M, Yang L, Cherniack AD, Shen H, et al. The somatic genomic landscape of chromophobe renal cell carcinoma. Cancer Cell 2014;26:319–30.
55. Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, et al. Mutational processes molding the genomes of 21 breast cancers. Cell 2012;149:979–93.
56. Cortés-Ciriano I, Lee JJ-K, Xi R, Jain D, Jung YL, Yang L, et al. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing. Nat Genet 2020;52:331–41.
57. Faltas BM, Prandi D, Tagawa ST, Molina AM, Nanus DM, Sternberg C, et al. Clonal evolution of chemotherapy-resistant urothelial carcinoma. Nat Genet 2016;48:1490–9.
58. McGranahan N, Favero F, de Bruin EC, Birkbak NJ, Szallasi Z, Swanton C, et al. Clonal status of actionable driver events and the timing of mutational processes in cancer evolution. Sci Transl Med 2015;7:283ra54.
59. Venkatesan S, Angelova M, Puttick C, Zhai H, Caswell DR, Lu W-T, et al. Induction of APOBEC3 exacerbates DNA replication stress and chromosomal instability in early breast and lung cancer evolution. Cancer Discov 2021;11:2456–73.
60. Lackey L, Law EK, Brown WL, Harris RS. Subcellular localization of the APOBEC3 proteins during mitosis and implications for genomic DNA deamination. Cell Cycle 2013;12:762–72.
61. Nowarski R, Wilner OI, Cheshin O, Shahar OD, Kenig E, Baraz L, et al. APOBEC3G enhances lymphoma cell radioresistance by promoting cytidine deaminase-dependent DNA repair. Blood 2012;120:366–75.
62. Stopak K, de Noronha C, Yonemoto W, Greene WC. HIV-1 Vif blocks the antiviral activity of APOBEC3G by impairing both its translation and intracellular stability. Mol Cell 2003;12:591–601.
63. Talluri S, Samur MK, Buon L, Kumar S, Potluri LB, Shi J, et al. Dysregulated APOBEC3G causes DNA damage and promotes genomic instability in multiple myeloma. Blood Cancer J 2021;11:166.
64. McDaniel YZ, Wang D, Love RP, Adolph MB, Mohammadzadeh N, Chelico L, et al. Deamination hotspots among APOBEC3 family members are defined by both target site sequence context and ssDNA secondary structure. Nucleic Acids Res 2020;48:1353–71.
65. Senavirathne G, Jaszczur M, Auerbach PA, Upton TG, Chelico L, Goodman MF, et al. Single-stranded DNA scanning and deamination by APOBEC3G cytidine deaminase at single molecule resolution. J Biol Chem 2012;287:15826–35.
66. Chelico L, Pham P, Calabrese P, Goodman MF. APOBEC3G DNA deaminase acts processively 3′ → 5′ on single-stranded DNA. Nat Struct Mol Biol 2006;13:392–9.
67. Holtz CM, Sadler HA, Mansky LM. APOBEC3G cytosine deamination hotspots are defined by both sequence context and single-stranded DNA secondary structure. Nucleic Acids Res 2013;41:6139–48.
68. Buisson R, Langenbucher A, Bowen D, Kwan EE, Benes CH, Zou L, et al. Passenger hotspot mutations in cancer driven by APOBEC3A and mesoscale genomic features. Science 2019;364:eaaw2872.
69. Petljak M, Dananberg A, Chu K, Bergstrom EN, Striepen J, von Morgen P, et al. Mechanisms of APOBEC3 mutagenesis in human cancer cells. Nature 2022;607:799–807.
70. Green AM, Weitzman MD. The spectrum of APOBEC3 activity: from anti-viral agents to anti-cancer opportunities. DNA Repair (Amst) 2019;83:102700.
71. Robertson AG, Kim J, Al-Ahmadie H, Bellmunt J, Guo G, Cherniack AD, et al. Comprehensive molecular characterization of muscle-invasive bladder cancer. Cell 2017;171:540–56.
72. Wang S, Jia M, He Z, Liu X-S. APOBEC3B and APOBEC mutational signature as potential predictive markers for immunotherapy response in non-small cell lung cancer. Oncogene 2018;37:3924–36.
73. Middlebrooks CD, Banday AR, Matsuda K, Udquim K-I, Onabajo OO, Paquin A, et al. Association of germline variants in the APOBEC3 region with cancer risk and enrichment with APOBEC-signature mutations in tumors. Nat Genet 2016;48:1330–8.
74. Lindskrog SV, Prip F, Lamy P, Taber A, Groeneveld CS, Birkenkamp-Demtröder K, et al. An integrated multi-omics analysis identifies prognostic molecular subtypes of non-muscle-invasive bladder cancer. Nat Commun 2021;12:2301.
75. LaRue RS, Andrésdóttir V, Blanchard Y, Conticello SG, Derse D, Emerman M, et al. Guidelines for naming nonprimate APOBEC3 genes and proteins. J Virol 2009;83:494–7.
76. Bhattacharya C, Aggarwal S, Kumar M, Ali A, Matin A. Mouse apolipoprotein B editing complex 3 (APOBEC3) is expressed in germ cells and interacts with dead-end (DND1). PLoS One 2008;3:e2315.
77. Caval V, Jiao W, Berry N, Khalfi P, Pitré E, Thiers V, et al. Mouse APOBEC1 cytidine deaminase can induce somatic mutations in chromosomal DNA. BMC Genomics 2019;20:858.
78. Salas-Briceno K, Zhao W, Ross SR. Mouse APOBEC3 restriction of retroviruses. Viruses 2020;12:1217.
79. Hollern DP, Xu N, Thennavan A, Glodowski C, Garcia-Recio S, Mott KR, et al. B cells and T follicular helper cells mediate response to checkpoint inhibitors in high mutation burden mouse models of breast cancer. Cell 2019;179:1191–206.
80. Boi S, Kolokithas A, Shepard J, Linwood R, Rosenke K, Van Dis E, et al. Incorporation of mouse APOBEC3 into murine leukemia virus virions decreases the activity and fidelity of reverse transcriptase. J Virol 2014;88:7659–62.
81. Stavrou S, Zhao W, Blouch K, Ross SR. Deaminase-dead mouse APOBEC3 is an in vivo retroviral restriction factor. J Virol 2018;92:e00168-18.
82. Hakata Y, Landau NR. Reversed functional organization of mouse and human APOBEC3 cytidine deaminase domains. J Biol Chem 2006;281:36624–31.
83. Fantini D, Glaser AP, Rimar KJ, Wang Y, Schipma M, Varghese N, et al. A carcinogen-induced mouse model recapitulates the molecular alterations of human muscle invasive bladder cancer. Oncogene 2018;37:1911–25.
84. Deltourbe L, Lacerda Mariano L, Hreha TN, Hunstad DA, Ingersoll MA. The impact of biological sex on diseases of the urinary tract. Mucosal Immunol 2022;15:857–66.
85. Doseth B, Visnes T, Wallenius A, Ericsson I, Sarno A, Pettersen HS, et al. Uracil-DNA glycosylase in base excision repair and adaptive immunity. J Biol Chem 2011;286:16669–80.
86. Chan K, Resnick MA, Gordenin DA. The choice of nucleotide inserted opposite abasic sites formed within chromosomal DNA reveals the polymerase activities participating in translesion DNA synthesis. DNA Repair (Amst) 2013;12:878–89.
87. Mikl MC, Watt IN, Lu M, Reik W, Davies SL, Neuberger MS, et al. Mice deficient in APOBEC2 and APOBEC3. Mol Cell Biol 2005;25:7270–7.
88. Rizvi NA, Hellmann MD, Snyder A, Kvistborg P, Makarov V, Havel JJ, et al. Mutational landscape determines sensitivity to PD-1 blockade in non–small cell lung cancer. Science 2015;348:124–8.
89. Steele CD, Abbasi A, Islam SMA, Bowes AL, Khandekar A, Haase K, et al. Signatures of copy number alterations in human cancer. Nature 2022;606:984–91.
90. Wörmann SM, Zhang A, Thege FI, Cowan RW, Rupani DN, Wang R, et al. APOBEC3A drives deaminase domain-independent chromosomal instability to promote pancreatic cancer metastasis. Nat Cancer 2021;2:1338–56.