Abstract
To explore the role of NBN as a pan-cancer susceptibility gene.
Matched germline and somatic DNA samples from 34,046 patients were sequenced using Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets and presumed pathogenic germline variants (PGV) identified. Allele-specific and gene-centered analysis of enrichment was conducted and a validation cohort of 26,407 pan-cancer patients was analyzed. Functional studies utilized cellular models with analysis of protein expression, MRN complex formation/localization, and viability assessment following treatment with γ-irradiation.
We identified 83 carriers of 32 NBN PGVs (0.25% of the studied series), 40% of which (33/83) carried the Slavic founder p.K219fs. The frequency of PGVs varied across cancer types. Patients harboring NBN PGVs demonstrated increased loss of the wild-type allele in their tumors [OR = 2.7; confidence interval (CI): 1.4–5.5; P = 0.0024; pan-cancer], including lung and pancreatic tumors compared with breast and colorectal cancers. p.K219fs was enriched across all tumor types (OR = 2.22; CI: 1.3–3.6; P = 0.0018). Gene-centered analysis revealed enrichment of PGVs in cases compared with controls in the European population (OR = 1.9; CI: 1.3–2.7; P = 0.0004), a finding confirmed in the replication cohort (OR = 1.8; CI: 1.2–2.6; P = 0.003). Two novel truncating variants, p.L19* and p.N71fs, produced a 45 kDa fragment generated by alternative translation initiation that maintained binding to MRE11. Cells expressing these fragments showed higher sensitivity to γ-irradiation and lower levels of radiation-induced KAP1 phosphorylation.
Burden analyses, biallelic inactivation, and functional evidence support the role of NBN as contributing to a broad cancer spectrum. Further studies in large pan-cancer series and the assessment of epistatic and environmental interactions are warranted to further define these associations.
The DNA damage repair gene NBN has been included in some multigene clinical testing panels although the evidence for its contribution to cancer susceptibility remains limited. On the basis of an analysis of 34,046 samples subject to tumor normal sequencing of >300 cancer-associated genes, we observed that 1 in 400 individuals carry a germline pathogenic NBN variant, a frequency significantly higher than in the cancer-free population. Rates of pathogenic variants varied between cancer types, observed more frequently in lung and pancreatic cancer, where a higher rate of somatic loss of the wild-type allele was also observed. Functional studies utilizing cellular models of clinically observed NBN pathogenic variants were consistent with a hypomorphic role in binding ability to MRE11 and activation of ATM. These findings support a role for NBN in cancer susceptibility but suggest that larger sample sizes and additional functional analyses will be needed to elucidate the clinical relevance of these novel associations.
Introduction
Biallelic pathogenic germline variants (PGV) in NBN underlie the etiologic basis of Nijmegen breakage syndrome (NBS), an autosomal recessive condition characterized by growth retardation, microcephaly, radiosensitivity, immunodeficiency, and cancer predisposition, mainly to lymphoid malignancies (1–4). NBN is involved in DNA damage response (DDR) forming the MRN complex with MRE11 and RAD50. This complex plays an important role in the early steps of DDR as it acts as a sensor of double-strand breaks and recruiter of multiple downstream proteins, including ATM, H2AX, and BRCA1-containing complexes to achieve DNA repair (5).
A hypomorphic frameshift variant, c.657_661delACAAA (p.K219Nfs*16), hereafter p.K219fs, has been noted to be a founder in the Slavic/Eastern European population (6). This variant leads to the expression of two fragments of 26 and 70 kDa (p26 and p70, respectively), the latter resulting from alternative translation initiation, maintaining binding ability to MRE11 (7). These fragments retain residual function as NBN deficiency results in embryonic and cellular lethality; p70 levels varied between patients with NBS, homozygous for p.K219fs, and lower levels were significantly correlated with a higher risk of developing cancer (8–11).
A significant number of studies on the association of germline NBN variation and cancer risk has been performed in the Eastern European population mainly due to high prevalence of the p.K219fs founder mutation. While initial reports were based on observations that relatives of patients with NBS presented with increased frequencies of multiple cancer types (12), follow-up studies within the Slavic population provided additional granularity with respect to specific cancer types. While several of these studies support the association of NBN p.K219fs with increased risk for prostate cancer (13–15) and lymphoid malignancies (16–18), conflicting evidence exists for a risk association with other cancer types, notably breast cancer and colorectal cancer within this population (19–21). Of note, an increased risk for prostate cancer in carriers of NBN PGVs has also been reported in men of African ancestry (22). Studies in Asian populations have primarily focused on NBN polymorphic variation, namely the p.E185Q variant. Multiple groups reported a potential risk-modifying effect of p.E185Q especially in the context of known environmental stressors linked to the respective cancer type, such as smoking in lung cancers, drinking in hepatocellular carcinoma, and smoking in combination with oral contraceptive use in cervical cancers (23–26). The conflicting contribution of NBN PGVs to breast cancer susceptibility was recently addressed by two large population-based case–control studies that provided no evidence for an association, resulting in downgrading the clinical actionability of NBN PGVs (27–29). However, it remains unclear whether NBN plays a role in the predisposition to other cancer types.
This study addresses the putative association of NBN PGVs with cancer susceptibility by leveraging germline and somatic genomic data from an anonymized cohort of 34,046 patients diagnosed with more than 20 cancer types. To assess the phenotypic spectrum and associations, we performed case–control burden analyses, determined the rates of somatic biallelic inactivation, and conducted cosegregation analyses as well as experiments exploring the functional significance of observed NBN PGVs in cellular model systems.
Materials and Methods
Primary study cohort and sequencing data
NBN variation was assessed in an anonymized set of 34,046 patients who were treated at Memorial Sloan Kettering Cancer Center (MSKCC) from January 2014 until August 2019. The patients received tumor and matched normal DNA sequencing using MSK-IMPACT (Integrated Mutation Profiling of Actionable Cancer Targets) to interrogate 341–468 genes, depending on panel version (30). This study was conducted conforming with the U.S. common rule. Informed written consent was obtained from each subject or each subject's guardian as part of Institutional Review Board (IRB)-approved protocols. This study was approved by the IRB of MSKCC. Detailed clinical data including personal and family history of cancer were available for a subset of patients who consented to receiving germline results for 88 American College of Medical Genetics and Genomics (ACMG) cancer predisposing genes.
The patients were diagnosed with more than 20 cancer types including lung (n = 4,848; 14.1%), breast (4,661; 13.5%), colorectal (3,251; 9.5%), prostate (2,061; 6%), and pancreatic cancers (1,945; 5.6%). The complete list of cancer types is provided in Supplementary Table S1.
Germline and somatic variant calling were performed as described previously (31, 32). Genetic ancestry of patients was inferred from common SNPs genotyped on MSK-IMPACT (33), and the data were filtered to exclude variants with a corresponding population minor allele frequency (MAF) in gnomAD v.2.1.1 greater than 1%.
Variant annotation and classification
Data were annotated with ANNOVAR (34) using the gnomad211_exome, dbnsfp41a (35), and ClinVar (January 23, 2021) databases. PathoMAN (36) was used to automatically classify variants according to the ACMG/Association for Molecular Pathology (AMP) guidelines into benign/likely benign (B/LB), variant of unknown significance (VUS), and pathogenic/likely pathogenic (P/LP). The detailed ACMG guidelines used for each automated classification are summarized in Supplementary Table S2.
Because of the truncating nature of all known pathogenic NBN variants, a classification system based solely on variant type was also utilized: silent/synonymous, missense, and truncating/disruptive. A REVEL score ≥ 0.7 was used to identify potentially deleterious missense variants based on its good performance compared with other in silico predictors in a recent comparative study (37, 38).
Association analysis of NBN variants and collapsing burden tests stratified by population-specific classification
Case–control comparisons were performed for each population separately, including European, European [Ashkenazi Jewish (ASJ)], East Asian, South Asian, and African ancestries. Carriers with admixed ancestries were not included in burden analyses. The allele count differences of NBN germline variants were resolved both at the variant and gene levels (collapsed burden test) and were compared using Fisher exact test in R software version 4.1.2. P values lower than 0.05 were considered statistically significant.
Somatic second hits
Loss of heterozygosity (LOH) for germline variants was evaluated using the framework described previously (31, 32, 39). The resulting calls identified both the loss of wild-type allele (WT-loss) and that of the variant allele (VAR-loss). In addition, cases where a somatic truncating NBN mutation was also identified are reported. Variant phasing was possible only in one case thanks to the proximity of the PGV and the somatic mutation. In the remaining cases, it was not possible to determine variant phasing.
Replication cohorts
To validate the findings from MSK-IMPACT patients, we relied on data from 26,407 pan-cancer patients sequenced at Ambry Genetics using clinically validated multigene testing panels (40). We also used The Cancer Genome Atlas project (TCGA) whole-exome sequencing (WES) data from 10,268 patients with cancer to validate LOH in germline carriers. The cancer types included in these two cohorts are detailed in Supplementary Tables S3 and S4.
For TCGA data, BAM files were downloaded from the Genomic Data Commons database. Germline and tumor variant calling were performed using Strelka2 germline caller on either normal or tumor data (41). Variants with a low overall quality were removed, keeping only those flagged as “PASS” in the FILTER column of the VCF files. Germline variant calls were annotated with ANNOVAR using the gnomad211_exome and ENSEMBL v88 (Gencode v26) databases. We applied the same MAF filter used for MSK-IMPACT data to exclude common variants.
LOH was estimated using the variant allele frequency (VAF) of the germline variant from the tumor data, and the resulting variant calls were used to identify WT-loss (VAF ≥ 0.7) and VAR-loss (VAF ≤ 0.3). Germline variants that were missing in the corresponding tumor data were manually curated by inspecting the BAM files using Integrative Genomics Viewer. Similarly, germline variant calls with insufficient overall sequencing coverage in the tumor were discarded.
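The VAF thresholds above translate into a simple decision rule, sketched here in Python. The function name `classify_loh` and the label used for intermediate VAFs are assumptions for illustration; only the two cutoffs (0.7 and 0.3) come from the text.

```python
def classify_loh(tumor_vaf, wt_loss_min=0.7, var_loss_max=0.3):
    """Classify a germline variant by its allele frequency in the tumor.

    Thresholds follow the text: VAF >= 0.7 suggests loss of the wild-type
    allele, VAF <= 0.3 suggests loss of the variant allele.
    """
    if not 0.0 <= tumor_vaf <= 1.0:
        raise ValueError("VAF must lie between 0 and 1")
    if tumor_vaf >= wt_loss_min:
        return "WT-loss"
    if tumor_vaf <= var_loss_max:
        return "VAR-loss"
    # Label for intermediate VAFs is an assumption, not a term from the study.
    return "no LOH call"

calls = [classify_loh(v) for v in (0.85, 0.48, 0.12)]
```

A fixed-threshold rule like this does not account for tumor purity or local copy number, which is one reason low-coverage and missing calls were curated manually.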
Site-directed mutagenesis
The pCR4-TOPO plasmid containing human NBN cDNA (clone BC136803) was purchased from TransOMIC. The NBN cDNA was cloned into the pLX302 lentiviral expression plasmid, obtained from David Root (RRID: Addgene_25896). The NBN mutants were generated from the WT NBN plasmid using the QuickChange II XL Site-Directed Mutagenesis Kit (Agilent).
Cell culture and transfections
SV40-transformed patient-derived NBS-ILB1 cells, kindly supplied by Patrick Concannon (University of Florida, Gainesville, FL), were grown in DMEM supplemented with 15% FBS and 1% penicillin-streptomycin. HEK293T cells (ATCC, catalog no. CRL-3216, RRID: CVCL_0063) were maintained in DMEM supplemented with 10% FBS and 1% penicillin-streptomycin. Cell cultures were maintained in a humidified incubator at 37°C in 5% CO2 and tested for Mycoplasma before being used in experiments. Transfections were carried out with the Amaxa Cell Line Nucleofector Kit V (Lonza) and Nucleofector IIb device according to the manufacturer's instructions. Viral vectors, cotransfected with psPAX2 (RRID: Addgene_12260) and pseudotyped with VSV-G, were produced in HEK293T cells using Lipofectamine 2000 transfection reagent. The virus supernatant was concentrated by centrifugation for 90 minutes at 20,000 RPM at 4°C and pellets were dissolved in OptiMEM (Gibco). Transduction of cells with virus supernatant was carried out in the presence of 6 μg/mL Polybrene. Stably transfected cell lines were generated by selection with 0.5 μg/mL puromycin.
Real-time PCR
RNA was extracted 24 hours after transduction using the RNeasy Mini Kit (Qiagen) and reverse transcribed with the ReadyScript cDNA Synthesis Mix (Sigma-Aldrich). Quantitative real-time PCR analyses were performed on an ABI PRISM 7900HT Sequence Detection System using the Power SYBR Green PCR Master Mix (Life Technologies) according to the manufacturer's instructions. Following initial incubation for 10 minutes at 95°C, amplification was performed for 40 cycles at 95°C for 15 seconds and 60°C for 1 minute. The RPL32 gene was used as the internal standard. Analysis was performed on the basis of the comparative CT method. Values reported are mean of triplicate experiments. The following primer sequences were used: hsRPL32_F, CATCTCCTTCTCGGCATCA; hsRPL32_R, AACCCTGTTGTCAATGCCTC; hsNBN_F1, GGCTTTTCCCGAACTTTGAAG; hsNBN_R1, AGCAGTTTTCCCAGAGACATC.
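The comparative CT calculation referred to above can be sketched as follows. This assumes the standard 2^(-ΔΔCT) form with roughly 100% amplification efficiency; the function name `relative_expression` and the CT values are hypothetical and not taken from the study.

```python
from statistics import mean

def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative CT (2^-ddCT) method.

    Each argument is a list of replicate CT values. The reference gene
    plays the role of the internal standard (RPL32 in the text).
    """
    # Normalize the target to the reference gene within each condition.
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    # Compare the normalized values between conditions.
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)

# Hypothetical triplicate CT values (not study data).
fold = relative_expression(
    ct_target_sample=[22.0, 22.0, 22.0], ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[25.0, 25.0, 25.0], ct_ref_control=[18.0, 18.0, 18.0],
)
```

A lower CT means earlier amplification, so a target that crosses threshold three cycles sooner than in the control (after normalization) corresponds to an eightfold higher transcript level under the efficiency assumption.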
Western blotting
Protein lysates were prepared in RIPA buffer (Pierce), supplemented with Halt protease and phosphatase inhibitor cocktail (Thermo Fisher Scientific). Samples were run on 4%–12% gradient Bis-Tris SDS-PAGE gels (Invitrogen), transferred onto polyvinylidene difluoride membranes (Bio-Rad), and probed with antibodies against NBN (1:5,000), MRE11 (1:5,000), KAP1, phospho-KAP1 (1:2,500; Abcam), and GAPDH (1:1,000; Santa Cruz Biotechnology, catalog no. sc-20357, RRID: AB_641107). Horseradish peroxidase–conjugated or fluorophore-conjugated secondary antibodies were detected using ECL Prime Western Blotting Detection Reagent (GE Healthcare). Antibodies raised against human NBN and MRE11 proteins were custom made and kindly provided by the Petrini lab at MSKCC (42).
Cell viability assays
For viability assessment following treatment with γ-irradiation, cells were seeded into 96-well plates (8 replicate wells per cell line) 24 hours prior to treatment. Cells were treated with the indicated doses emitted from a cesium 137 (137Cs) radiation source. Cell viability was measured after 72 hours using the CellTiter-Glo Luminescent Cell Viability Assay (Promega).
Data availability
Raw data for this study were generated at MSKCC for the primary study cohort, and at Ambry Genetics and TCGA for the replication series. Derived data supporting the findings of this study are available from the corresponding author upon request.
Results
Assessment of NBN germline variants in a large cohort of pan-cancer patients
Our analysis identified 978 carriers of 188 unique rare NBN germline variants (population MAF < 1%, ClinVar B/LB with two gold stars excluded), as well as two carriers of germline copy-number alterations resulting in NBN deletion. Truncating variants represented approximately 0.25% of the entire series (1/411 patients), consisting of 83 carriers of 32 variants, all heterozygous with the exception of 1 patient of ASJ ancestry with a homozygous splice-site variant, c.481-2A>T, who was not diagnosed with NBS but developed three primary metachronous cancers including thyroid cancer at age 33, renal cell carcinoma at age 61, and signet ring cell gastric cancer at age 63. Among carriers of truncating variants, 40% (33/83) harbored the Slavic founder p.K219fs, while the remaining carriers showed variants spanning the entire NBN coding sequence, where a notable fraction affected N-terminal residues and/or functionally relevant domains (Fig. 1A). We also identified 24 carriers (0.07% of the series) of predicted deleterious missense variants (REVEL score ≥ 0.7), 10/24 (41%) of whom carried c.788T>C (p.F263S), a variant identified mainly in the African population and reported as LB/VUS in ClinVar. Data from Ambry showed a similar frequency of carriers of NBN PGVs, consisting of 0.23% (60/26,407, 1/440 patients).
Assessment of the positivity rate of NBN truncating variants per cancer type revealed significant differences between the frequently diagnosed tumor types such as bladder cancer (0.56%; 6/1,064), glioma (0.46%; 7/1,536), lung (0.4%; 20/4,941), pancreatic (0.36%; 7/1,934), prostate (0.19%; 4/2,059), breast (0.19%; 9/4,725), ovarian (0.16%; 2/1,246), and colorectal (0.06%; 2/3,255) cancers, among other cancer types (Fig. 1B). These findings point to a potential preferential association of NBN PGVs with certain solid tumors over others that had been historically linked to NBN. It is important to note that the positivity rate per tumor type was only partially reproduced in Ambry's cohort, where we observed concordance for pancreatic (0.49%; 7/1,433), breast (0.23%; 39/17,021), and colorectal cancer (0.14%; 3/2,112; Supplementary Table S4). Of note, this cohort did not include patients with lung or bladder cancer, as these are not traditionally referred for germline-only testing. There are significant differences in the cancer types included in each cohort, and the sample sizes for many tumor types are insufficient to draw robust conclusions. For instance, there were only 13 cancer types with >1,000 samples in our main series (Supplementary Table S1). In addition, bladder cancer had the highest rate of all tumors in our primary cohort, but this finding was not replicated in the TCGA cohort (2/412, 0.49%), where other cancer types had higher rates, such as gastric (4/443, 0.90%) and colorectal cancer (4/595, 0.67%; Supplementary Table S3).
Among the 83 patients with NBN truncating variants, 10 (12%) also carried a P/LP variant in another cancer-associated gene, as reported in ClinVar (accession time: early 2021). Four of these multicarriers were diagnosed with breast cancer and harbored pathogenic variants in BRCA1 (n = 3 patients) and ATM (1). In addition, two were diagnosed with lung cancer (CHEK2 or PARK2 carriers), one with prostate (BRCA2), one with melanoma (DDR2), one with pancreatic neuroendocrine tumor (MEN1), and one with soft-tissue sarcoma (FANCC; Fig. 1C; Supplementary Table S5). Of note, only one multicarrier was observed in Ambry's data, a patient diagnosed with breast cancer who in addition to NBN, also carried an ATM PGV (Supplementary Table S6). Overall, these results point to a likely modulatory role of NBN in most of these multicarriers, as the resulting tumor lineages could be explained mainly by the alternative PGVs in established cancer susceptibility genes. Excluding these multicarriers did not significantly affect the observed cancer spectrum illustrated in Fig. 1B.
LOH
To further characterize the role of NBN germline variants, we assessed the rates of WT-loss and VAR-loss in the corresponding tumors. This analysis revealed a significantly increased WT-loss rate in P/LP variants compared with B/LB, as classified by PathoMAN (P = 0.002; OR = 2.7; CI: 1.4–5.5), and also when comparing truncating variants with silent/synonymous variants (P = 0.00003; OR = 3.9; CI: 2–7.6; Fig. 1D). Conversely, no statistically significant differences were noted for VAR-loss. These findings were replicated in TCGA data, which show a similar trend, with statistically significant results when comparing WT-loss in truncating variants with missense (P = 0.02; OR = 4.1; CI: 1.1–13.2), and borderline when comparing truncating with silent (P = 0.06; OR = 3.2; CI: 0.75–13). Similarly, no differences were noted for VAR-loss (Supplementary Fig. S1). Of note, of the three MRN complex genes, increased WT-loss rate in P/LP variant carriers was only observed for NBN and not for MRE11 or RAD50 in our primary cohort (Supplementary Fig. S2).
In addition, we studied the rates of WT-loss per cancer type and observed a trend for higher biallelic inactivation in lung and pancreatic cancers in comparison with breast and colorectal cancers, further supporting the notion of a tumor-specific role for NBN (Supplementary Fig. S3). However, the statistical power of these associations was impacted by the smaller sample sizes within subgroups. Of note, we identified a germline carrier of p.K219fs with somatic p.G214*, in trans, that was diagnosed with lung cancer, supporting our observations. Overall, mutations or gene fusions that could act as somatic second hits were infrequent in our data and including them did not yield different results than those obtained when focusing only on LOH (Supplementary Table S7; Supplementary Fig. S4).
Case–control burden analyses
The frequencies of germline NBN variants identified in patients with cancer (MSK-IMPACT) were compared with noncancer controls reported in gnomAD v2.1.1 (WES data) to identify the enrichment of individual variants (variant-centered analysis) or the overall enrichment of (predicted) pathogenic variants (gene-centered). We leveraged genetically inferred ancestries performed by the Center of Molecular Oncology at MSKCC to guarantee population-matched comparisons (33).
Variant-centered analyses identified enrichment of several NBN variants including the Slavic founder p.K219fs (pan-cancer cases: 0.078% vs. controls: 0.035%; P = 0.0018; OR = 2.22; CI: 1.3–3.6) and c.425A>G (p.N142S; 0.075% vs. 0.019%; P = 9.6E-7; OR = 4; CI: 2.2–7.6) in the European population. Although limited by sample size, we also identified enrichment of one missense variant in the South Asian population, c.278C>T (p.S93L; 1.1% vs. 0.42%; P = 0.002; OR = 2.6; CI: 1.35–4.63). All three above-mentioned variants were also significantly enriched in cases when compared with gnomAD v3 noncancer controls (whole-genome sequencing data; Supplementary Table S8; Supplementary Fig. S5). The pan-cancer association of p.K219fs was replicated in Ambry's data (0.06%; P = 0.032; OR = 1.82; CI: 1.02–3.2), but not that of p.N142S (Supplementary Table S9). We could not assess the frequency of p.S93L in this cohort. Similarly, TCGA data also support an association of p.K219fs (0.07%; P = 0.033; OR = 2.09; CI: 1–4.1) but not that of p.N142S. In the TCGA dataset, we also identified four carriers of p.S93L whose ancestry could not be determined, hampering a population-matched enrichment analysis. Both p.N142S and p.S93L are considered likely benign by several laboratories in ClinVar, occurring at poorly conserved residues and not supported by in silico evidence.
Overall, truncating variants were observed at double the frequency in MSK patients with cancer (83/34,046; 0.245%; 1/411) compared with controls (141/118,319; 0.12%; 1/840), this difference being statistically significant (P < 0.0001; OR = 2; CI: 1.5–2.7). Similar results were obtained for patients with cancer sequenced at Ambry (60/26,407; 0.23%; 1/440; P < 0.0001; OR = 1.9; CI: 1.4–2.6). Interestingly, the difference remained significant when excluding the Slavic founder from the analysis in MSK patients (50/34,046; 0.147%; P = 0.002; OR = 1.72; CI: 1.2–2.44), and in Ambry's data (38/26,407; 0.144%; P = 0.002; OR = 1.72; CI: 1.2–2.44) compared with gnomAD v2.1.1 (101/118,319; 0.085%). Similar findings were noted when focusing only on the European population (Supplementary Table S9).
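For readers who wish to retrace the arithmetic behind these odds ratios, the sketch below computes an OR and an approximate 95% CI from carrier counts. It uses the sample odds ratio with a Woolf (log-normal) interval, which is an assumption made for simplicity; the study used Fisher exact test in R, whose conditional estimate and interval can differ slightly. The function name `odds_ratio_woolf` is hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio_woolf(case_carriers, case_total, ctrl_carriers, ctrl_total, z=1.96):
    """Sample odds ratio with a Woolf (log-normal) confidence interval.

    Returns (odds_ratio, lower, upper). All four cells of the 2x2 table
    must be nonzero for the standard error to be defined.
    """
    a, b = case_carriers, case_total - case_carriers
    c, d = ctrl_carriers, ctrl_total - ctrl_carriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Carrier counts quoted in the text: MSK cases vs. gnomAD v2.1.1 controls.
or_msk, lo_msk, hi_msk = odds_ratio_woolf(83, 34_046, 141, 118_319)
```

With the counts quoted above, this approximation should land close to the reported OR of 2 (CI: 1.5–2.7), treating each count as one carrier per individual.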
Gene-level collapsing analysis revealed the overall enrichment of P/LP variants, as classified by PathoMAN, in cases compared with gnomAD v2.1.1 controls in the European population (0.14% vs. 0.07%; P = 0.004; OR = 1.91; CI: 1.32–2.74). A similar trend was also observed in the African (0.15% vs. 0.07%), ASJ (0.12% vs. 0.09%), and East Asian (0.05% vs. 0.03%) populations, albeit not statistically significant (Supplementary Table S10). No enrichment of predicted deleterious missense variants in NBN was observed in cases compared with controls (Supplementary Table S11).
Caution must be exercised when interpreting these burden analyses as they relied on publicly available data (gnomAD v2.1.1 noncancer) as the cancer-free population, and these were sequenced using a different method (exome/genome sequencing) compared with MSK-IMPACT or Ambry's datasets (multigene testing panels). Notably, the rate of carriers of NBN PGVs varies in other control populations (27, 28). Finally, potential differences in bioinformatic pipelines and filtering strategies that were used to process and filter the data in these various studies may be responsible for the differences observed. Here, despite technical differences in genotype assessment, we applied the same filtering approach to our cohorts to maximize consistency.
Cosegregation studies in patients with a family history of cancer
Cosegregation analysis was performed for kindreds with confirmed NBN PGVs that reported a family history of cancer, ascertained either at MSKCC or through the Prospective Registry of Multiplex Testing (PROMPT). The proband of family A, diagnosed with breast cancer at age 42, was a heterozygous carrier of c.211_212insGA (p.N71fs). This same variant was documented in the proband's daughter, diagnosed with acute lymphoblastic leukemia, as well as in the proband's maternal aunt, diagnosed with bladder cancer at age 67 (Fig. 2A).
In family B, the proband was a carrier of c.1903A>T (p.K635*), which appears to be an ASJ founder given that 11 of 12 of the identified carriers had a clear ASJ ancestry and that this variant is absent in non-ASJ European individuals in gnomAD (both v2.1.1 and v3). The proband's nephew, diagnosed with testicular cancer at age 20, was also documented to be a carrier of this variant, while the proband's brother, diagnosed with appendiceal cancer at age 65, was not (Fig. 2B).
The proband of family C, diagnosed with breast cancer at age 54, was a carrier of two variants: NBN c.37+2dupT and RAD51D c.556C>T (p.R186*), the first inherited from her mother, diagnosed with melanoma at age 80, and the second inherited from her father, diagnosed with prostate cancer at age 75 (Fig. 2C). Similarly, the proband of family D, cancer free at age 31, was also a carrier of two PGVs, NBN p.K219fs and ATM c.2849T>G (p.L950R). Her mother, diagnosed with breast cancer at age 40, was not a carrier of either variant, while the paternal uncle, diagnosed with prostate cancer and lymphoma at age 66, was a carrier of the ATM variant (Fig. 2D).
In family E, the proband was a heterozygous carrier of p.K219fs, as was her husband; both were cancer free at age 35. One of their daughters was found to be a heterozygous carrier, while the other was homozygous and affected by NBS. The proband's mother was also a heterozygous carrier and developed endometrial cancer at age 55.
Cosegregation study of missense NBN variants was performed in four kindreds and summarized in Supplementary Fig. S6. Overall, because of small kindred sizes, cosegregation analysis was limited by the inability to calculate logarithm of the odds (LOD) scores. Further limiting agnostic analysis, there was likely an ascertainment bias toward breast cancer in kindreds available for this study, limiting sample sizes to study individual associations with other cancer types.
Early truncating variants in NBN lead to expression of C-terminal protein fragments by alternative translation initiation and show a hypomorphic phenotype
Two novel NBN protein truncating variants, c.56delT (p.L19*) and c.211_212insGA (p.N71fs), were selected for further functional studies; both of these variants create a stop codon early in the N-terminal part of NBN, disrupting the forkhead-associated domain and the breast cancer C-terminal 1 domain (BRCT1), both critical for the retained NBN functionality in p.K219fs models (p26 fragment; ref. 9). The p.L19* variant was found in a patient with multiple primary cancers of the prostate, lung, and blood (chronic lymphocytic leukemia), while p.N71fs was identified in several members of a family described above (Fig. 2A). To gain insight into the functional impact of the selected NBN variants, we stably overexpressed the mutant cDNAs in a previously well-characterized NBN-deficient cell line, NBS-ILB1, an SV40-transformed fibroblast cell line established from an NBS patient homozygous for the Slavic founder variant that does not express detectable levels of the p70 fragment.
Overexpression of p.L19* and p.N71fs cDNAs was confirmed by qRT-PCR (Fig. 3A) and Western blot analysis (Fig. 3B), and resulted in the presence of a truncated protein fragment of around 45 kDa (p45). Further analysis by separation into nuclear and cytoplasmic protein fractions revealed that these truncated NBN fragments were imported into the nucleus and likely bind to MRE11 because nuclear import of the latter is passive and depends on prior binding to NBN via their interaction domain. To verify this hypothesis, we generated both N-terminal and C-terminal HA-tagged versions of the wild-type and mutant cDNA constructs. No NBN protein was detected on the basis of the N-terminal tagged fragments, while the C-terminal tagged proteins were visualized as the previously identified 45 kDa fragment following immunoprecipitation (Fig. 3C). In addition, co-immunoprecipitation of MRE11 confirmed binding to this truncated C-terminal NBN fragment.
We hypothesized that the mechanism of generation of these C-terminal NBN fragments involves alternative translation initiation from a methionine downstream of the start codon at a location that would lead to a fragment of the observed size of approximately 45 kDa. Putative methionines were identified, and several candidates were mutated to leucines to abrogate translation initiation. From the three sites tested (p.M353, p.M389, and p.M395), the p.M389L variant led to a strong reduction of p45 expression while M395L resulted in a lower abundance of this fragment, indicating that the latter might be a less productive translation initiation site (Fig. 3D). Of note, no significant differences in residue conservation were observed between the three tested methionines (Supplementary Fig. S7).
Subsequent analysis explored the ability of these truncated fragments to mediate NBN function in DNA damage repair. NBS-ILB1 cells overexpressing either the wild-type or mutant NBN fragments produced by p.L19*/p.N71fs mutations as well as two control mutations were exposed to γ-irradiation at two different doses and viability was assessed thereafter (Fig. 3E). As controls, we overexpressed p.K219fs which is known to exhibit impaired NBN function as well as the p.E185Q mutant which is a common variant observed at allele frequencies of 0.3–0.45 across populations and thus considered benign. At a low γ-irradiation dose of 2 Gy the mutants showed a significantly (p.L19*, P ≤ 0.05; p.N71fs, P ≤ 0.0001) reduced capacity to rescue cell viability compared with the wild-type and p.E185Q controls but slightly above the parental NBS-ILB1 cells and the p.K219fs mutants. At the higher dose of 4 Gy, p.L19* and p.N71fs mutants displayed a significantly reduced ability to rescue cell viability comparable with the known deficient models (p.L19* and p.N71fs, P ≤ 0.0001). Finally, as another readout of NBN function, the activation of the ATM axis of the DNA damage response was measured by phosphorylation of the ATM target KAP1 at Serine 824 in response to ionizing radiation–induced DNA damage (Fig. 3F). This analysis showed a strong activation of KAP1 in NBS-ILB1 WT cells while both p.L19* and p.N71fs mutant overexpressing cells showed a reduced KAP1-S824 signal.
Discussion
Although NBN is associated with a recessive syndrome of cancer susceptibility, as well as an emerging role in clonal hematopoiesis, the significance of heterozygous pathogenic NBN variants in cancer predisposition continues to be defined (43). Here, analysis of NBN germline variation in a large dataset comprising more than 34,000 patients diagnosed with cancer confirmed the observed lack of association with breast cancer (27, 28) but suggested a potential role for NBN across other cancer phenotypes. This statistically significant overall association was the result of the superposition of weaker trends in subsets including lung, pancreas, and soft-tissue sarcomas, among others, with somatic WT-loss observed primarily in lung and pancreatic tumors. In these subsets, a functional loss and/or attenuation coupled with LOH in tumors could be contributing to tumorigenesis. This observation was supported by the clinical phenotypes of the 33 carriers of the Slavic founder, of whom 10/33 (30%) were diagnosed with lung cancer, 4/33 (12%) with pancreatic cancer, and only 3/33 (9%) with breast cancer. There was also no association of NBN PGVs with prostate cancer, where germline variants have recently been documented in studies seeking to define hallmarks of aggressivity and susceptibility to targeted therapies (15, 44, 45). While recent burden analyses have suggested a significant association with ovarian cancer risk (46, 47), this subset, as with breast cancer, failed to reach statistical significance in our data.
The major finding of this study, the pan-cancer association of NBN PGVs with cancer in the primary (MSK) dataset, was supported by an independent ascertainment. The observed 2-fold OR (CI: 1.5–2.7; P < 0.0001) in the primary dataset was quite similar to the OR of 1.9 (CI: 1.4–2.6; P < 0.0001) in the orthogonal dataset from Ambry, and the association remained significant when the Slavic founder was excluded from the analysis of either the MSK or the Ambry ascertainment compared with public controls.
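For readers less familiar with how an OR and its CI arise from carrier counts, the following minimal sketch computes an odds ratio with a Wald (log-scale) 95% confidence interval from a 2×2 table. It is an illustration only and not necessarily the method used in this study. The function name `odds_ratio_wald_ci` and the example counts are hypothetical; they are not the study's data.

```python
import math


def odds_ratio_wald_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Return (odds ratio, lower bound, upper bound) using a Wald interval on the log scale."""
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts chosen only to illustrate the calculation.
or_value, ci_low, ci_high = odds_ratio_wald_ci(50, 19950, 25, 19975)
print(f"OR = {or_value:.2f}; 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval whose lower bound stays above 1 corresponds to an enrichment of carriers among cases at the chosen confidence level. With very small cell counts, an exact method such as Fisher's test would usually be preferred over the Wald approximation.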
Given the hypomorphic nature of NBN truncating alleles, including the novel N-terminal variants identified in this report, it will be important to functionally characterize those variants located further downstream in the coding sequence. The generation of truncated proteins that are partially functional, such as the previously described p26/p70 fragments or the novel p45 fragments documented in this study, supports a hypomorphic role in binding ability to MRE11 and activation of ATM. Whether this model could be expanded to variants that affect residues beyond the preferential alternative translation initiation site that we identified remains to be determined. It seems reasonable to assume that the efficiency of translation of the novel p45 fragment would vary between patients, with lower levels mediating increased genomic instability and thereby cancer risk, as was observed in the case of the p70 fragment (8); this possibility should be explored further.
In addition, multiple reports have suggested an association of missense variants, such as c.643C>T (p.R215W) and c.511A>G (p.I171V), with increased cancer risk (48–50). For these two variants, we did not observe an increased frequency in patients with cancer compared with public controls. This observation, together with the relatively higher frequency of these variants compared with other rare variants, suggests that these missense variants make no contribution, or at most a modest one as low-penetrance alleles. Of note, although our analyses identified the enrichment of two missense variants in MSK cases compared with controls, namely c.425A>G (p.N142S) in the European population and c.278C>T (p.S93L) in the South Asian population, we did not observe such enrichment in two independent replication cohorts, further supporting their suggested benign nature (Supplementary Table S9).
Epistatic and oligogenic interactions of hypomorphic NBN variants with variants in other DNA repair genes will also require further inquiry; polygenic effects could contribute to the observed pan-cancer risk profile. The key interaction domains in MRE11, RAD50, and NBN have been studied to understand the functions of the MRN complex in DNA repair as well as in the activation of ATM and ensuing activation of DNA damage–dependent checkpoint pathways (51, 52). In carriers of NBN PGVs in our cohort, PGVs or predicted deleterious variants in the other two genes of the MRN complex were excluded (Supplementary Table S5).
Although the overall sample size of this study was large, the small number of heterozygous carriers of NBN PGVs in tumor subsets limited statistical associations. Other limitations of the study included the high proportion of patients diagnosed with advanced solid tumors, and the skewed distribution of tumor types sequenced in favor of those potentially benefiting from targeted therapies based on somatic genomic findings. In particular, cases with hematologic malignancies have only recently been added to the somatic and germline pipeline utilized here, limiting assessment of NBN germline variation in that context. In addition, while a considerable effort was made to ensure the accuracy of available clinical annotation, data were limited in some subsets, potentially leading to underestimates of those with multiple primary cancers (MPC), as discussed in a separate analysis of these data (53). This information, however, was available in Ambry's data, where we identified 15/60 (25%) carriers of NBN PGVs as patients with MPC who lacked PGVs in established cancer susceptibility genes (Supplementary Table S6). In addition, while matched normal-tumor DNA sequencing is a convenient approach to exclude clonal hematopoiesis (CH) as the originating mechanism of the observed NBN germline variants, we cannot rule out CH for variants where a VAR-loss was observed in the tumors. Finally, while the computational algorithm that was used to call LOH is robust, as it factors in tumor purity and ploidy (39), it is not immune to over- or under-calling these events in the absence of orthogonal validation.
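To illustrate why small carrier counts limit subset-level conclusions, the following minimal sketch computes Wilson 95% score intervals for the carrier proportions reported in this Discussion (10/33, 4/33, 3/33, and 15/60). The Wilson interval is our choice for illustration and is not an analysis performed in the study; the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return the (lower, upper) Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Proportions quoted in the text: Slavic founder carriers by tumor type, and MPC among Ambry carriers.
reported = {
    "lung (10/33)": (10, 33),
    "pancreatic (4/33)": (4, 33),
    "breast (3/33)": (3, 33),
    "MPC in Ambry carriers (15/60)": (15, 60),
}
for label, (k, n) in reported.items():
    low, high = wilson_interval(k, n)
    print(f"{label}: {k / n:.0%} (95% CI {low:.0%} to {high:.0%})")
```

With only 33 carriers, the intervals are wide, which is consistent with the caution expressed above about tumor-specific associations.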
In summary, the results here support the role of NBN PGVs as weak-to-moderate pan-cancer predisposing alleles, although no significant tumor-specific associations could be resolved. The biological plausibility of these associations is supported by genomic observations of preferential LOH of the WT allele in several tumor types, as well as by in vitro experiments utilizing cellular models that demonstrated significant functional consequences of some recurrent NBN PGVs. Taken together, these findings suggest the need for confirmation in large pan-cancer cohorts, as well as future studies exploring the potential role of epistatic as well as nongenetic interactions of NBN variants that may impact penetrance, phenotype, and therapeutic response.
Authors' Disclosures
S. Belhadj is an employee of Ambry Genetics at the time of publication. M. Mehine reports other support from Sigrid Jusélius Foundation during the conduct of the study. F. Couch reports grants from NIH during the conduct of the study; F. Couch also reports other support from GRAIL and Ambry Genetics, as well as personal fees from AstraZeneca outside the submitted work. Z. Stadler reports other support from Genentech/Roche, Adverum, Gyroscope Therapeutics, RegenexBio, Optos, Outlook Therapeutics, Regeneron, and Neurogene outside the submitted work. M. Robson reports honoraria from Research to Practice, Intellisphere, Physicians’ Education Resource, MyMedEd, MJH Holdings, and Change Healthcare; uncompensated consulting or advisory work with Artios Pharma, AstraZeneca, Daiichi Sankyo, Epic Sciences, Merck, Pfizer, Tempus Labs, and Zenith Pharma; institutional research funding from AstraZeneca, Merck, and Pfizer; editorial support from AstraZeneca and Pfizer; and research support from NIH/NCI Cancer Center Support Grant P30 CA008748 and the Breast Cancer Research Foundation. M. Berger reports personal fees from Eli Lilly, AstraZeneca, and PetDx outside the submitted work. R. Karam reports other support from Ambry Genetics during the conduct of the study, as well as other support from Ambry Genetics outside the submitted work. S. Topka reports grants from Robert and Kate Niehaus Center for Inherited Cancer Genomics, Breast Cancer Research Foundation, Sharon Corzine Research Fund, NCI Core grant P30 CA008748, NIH/NCI (1R01 CA217169 and 1R01 CA234617); U19CA203654, and Andrew Sabin Family Foundation during the conduct of the study. K. Offit is a co-founder of AnaNeo Therapeutics; however, shares have not been allotted (i.e. no financial interest) and the company is not targeting the pathway described in this article. No disclosures were reported by the other authors.
Authors' Contributions
S. Belhadj: Conceptualization, data curation, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. A. Khurram: Investigation, writing–review and editing. C. Bandlamudi: Resources, data curation, formal analysis, validation, investigation, writing–review and editing. G. Palou-Márquez: Data curation, formal analysis, validation, investigation, visualization, methodology. V. Ravichandran: Writing–review and editing. Z. Steinsnyder: Writing–review and editing. T. Wildman: Writing–review and editing. A. Catchings: Writing–review and editing. Y. Kemel: Resources, data curation, investigation, writing–review and editing. S. Mukherjee: Resources, data curation, formal analysis, writing–review and editing. B. Fesko: Investigation, writing–review and editing. K. Arora: Methodology, writing–review and editing. M. Mehine: Data curation, validation, investigation, methodology, writing–review and editing. S. Dandiker: Writing–review and editing. A. Izhar: Investigation, writing–review and editing. J. Petrini: Investigation, writing–review and editing. S. Domchek: Writing–review and editing. K.L. Nathanson: Writing–review and editing. J. Brower: Investigation, writing–review and editing. F. Couch: Writing–review and editing. Z. Stadler: Writing–review and editing. M. Robson: Writing–review and editing. M. Walsh: Writing–review and editing. J. Vijai: Writing–review and editing. M. Berger: Resources, methodology, writing–review and editing. F. Supek: Resources, validation, writing–review and editing. R. Karam: Resources, validation, writing–review and editing. S. Topka: Conceptualization, resources, data curation, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. K. Offit: Conceptualization, resources, supervision, funding acquisition, writing–original draft, writing–review and editing.
Acknowledgments
This work was supported by the Robert and Kate Niehaus Center for Inherited Cancer Genomics; the Breast Cancer Research Foundation, and the Sharon Corzine Research Fund; MSKCC is supported by the NCI Core grant P30 CA008748; NIH/NCI (1R01 CA217169 and 1R01 CA234617); U19CA203654. S. Belhadj is supported by a fellowship granted by the Andrew Sabin Family Foundation. M. Mehine is supported by the Sigrid Jusélius Foundation.
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).