Background:

Genetic susceptibility is associated with nasopharyngeal carcinoma (NPC). We previously identified rare variants potentially involved in familial NPC and common variants significantly associated with sporadic NPC.

Methods:

We conducted targeted gene sequencing of 20 genes [16 identified from the study of multiplex families, three identified from a pooled analysis of NPC genome-wide association study (GWAS), and one identified from both studies] among 819 NPC cases and 938 controls from two case–control studies in Taiwan (independent from previous studies). A targeted, multiplex PCR primer panel was designed using the custom Ion AmpliSeq Designer v4.2 targeting the regions of the selected genes. Gene-based and single-variant tests were conducted.

Results:

We found that NPC was associated with combined common and rare variants in CDKN2A/2B (P = 1.3 × 10−4), BRD2 (P = 1.6 × 10−3), TNFRSF19 (P = 4.0 × 10−3), and CLPTM1L/TERT (P = 5.4 × 10−3). Such associations were likely driven by common variants within these genes, based on gene-based analyses evaluating common variants and rare variants separately (e.g., for common variants of CDKN2A/2B, P = 4.6 × 10−4; for rare variants, P = 0.04). We also observed a suggestive association with rare variants in HNRNPU (P = 3.8 × 10−3) for NPC risk. In addition, we validated four previously reported NPC risk–associated SNPs.

Conclusions:

Our findings confirm previously reported associated variants and suggest that some common variants in genes previously linked to familial NPC are associated with the development of sporadic NPC.

Impact:

NPC-associated genes, including CLPTM1L/TERT, BRD2, and HNRNPU, suggest a role for telomere length maintenance in NPC etiology.

While Epstein–Barr virus (EBV) infection is a necessary cause for virtually all undifferentiated nasopharyngeal carcinoma (NPC), only a small proportion of infected individuals develop NPC (1, 2). Inherited genetic susceptibility is hypothesized to be an important risk factor for NPC among EBV-infected individuals, and agnostic and targeted studies of sporadic and familial NPC support this notion (3, 4).

We recently completed a whole-exome sequencing study among 251 individuals from 97 multiplex NPC families from Taiwan in which novel, rare variants were implicated in the development of familial NPC (5). These variants are located in genes potentially involved in Notch signaling (NOTCH1, DLL3, LFNG, MAML1, MFNG, PSEN2), magnesium transport (NIPAL1), EBV entry into epithelial cells (ITGB6), modulation of EBV (BCL2L12, NEDD4L, LAMC2), telomere biology (CLPTM1L, BRD2, HNRNPU), DNA repair (PRKDC, MLH1), or modulation of cAMP signaling (RAPGEF3). Separately, a meta-analysis of sporadic NPC found significant associations for common polymorphisms within genes involved in telomere biology (CLPTM1L/TERT), apoptosis (MECOM, TNFRSF19), and cell-cycle regulation (CDKN2A/2B; ref. 6). Taken together, these studies of familial and sporadic NPC point to a set of candidate genes that might be important in NPC pathogenesis.

To follow up on these findings, we performed targeted gene sequencing of 20 genes/gene regions in an independent set of 1,757 NPC cases and controls from two studies in Taiwan. In this study, we aimed (i) to evaluate whether variability in genes involved in familial NPC might also be associated with sporadic NPC and (ii) to determine whether additional polymorphisms in genes previously associated with sporadic NPC could be identified.

Study population

The two case–control studies in Taiwan have been described previously (ref. 7; W.L. Hsu et al., submitted for publication). In brief, the first case–control study, conducted between 1991 and 1994, recruited incident NPC cases from the Taipei metropolitan area and age, sex, and residential area–matched controls. Of 378 eligible cases and 372 controls, 369 cases (98%) and 320 controls (86%) agreed to participate. The second case–control study, conducted between 2010 and 2014, recruited prevalent and incident NPC cases diagnosed after January 1, 2007, from northern and central Taiwan and age and sex frequency–matched controls from the same geographic regions. Of 1,850 eligible cases and 1,885 controls, 1,600 cases (86%) and 1,804 controls (96%) agreed to participate.

For the current study, we included all individuals with DNA available from the first case–control study (241 cases and 234 controls), and all individuals from the initial phase of the second case–control study (578 cases and 704 controls), as samples were not available for the remaining participants. For cases and controls, separately, we noted a similar distribution of age and sex for those who were included and excluded from each study (P > 0.05). In total, we included 819 NPC cases and 938 controls. Characteristics of the NPC cases and controls are presented in Supplementary Table S1. Of 819 NPC cases, 256 were prevalent (i.e., recruited after treatment was initiated/completed) and 563 were incident (i.e., recruited at the time of diagnosis and prior to treatment initiation). Written informed consent was obtained from all participants, and both studies were approved by human subject committees in Taiwan and the United States.

Targeted sequencing pipeline

DNA was extracted from peripheral blood lymphocytes using the QIAamp DNA Blood Mini Kit (Qiagen). The quality and quantity of the genomic DNA were assessed by Nanodrop 1000 (Thermo Fisher Scientific) and Qubit (Life Technologies), respectively. A targeted, multiplex PCR primer panel was designed using the custom Ion AmpliSeq Designer v4.2 (Life Technologies) targeting the coding regions of the 20 selected genes/gene regions. Sample DNA (30 ng) was amplified using this custom AmpliSeq primer pool, and libraries were prepared following the manufacturer's Ion AmpliSeq Library Preparation protocol (Life Technologies).

Raw sequencing reads generated by the Ion Torrent sequencer were quality and adaptor trimmed by Ion Torrent Suite and then aligned to the hg19 reference sequence by TMAP (https://github.com/iontorrent/TS/tree/master/Analysis/TMAP) using default parameters. Resulting BAM files were left-aligned using the Genome Analysis Toolkit (GATK) LeftAlignIndels module. Amplicon primers were trimmed from aligned reads. Variant calls and filtrations were made by Torrent Variant Caller 5.0 and GATK (UnifiedGenotyper v3.1; ref. 8). Variant annotation was done by snpEff, SnpSift (http://snpeff.sourceforge.net/), and Annovar (9). On average, we achieved a read length of 224 bp (with a minimum of 198 bp) across the samples, with an average mean depth of 1,113× (interquartile range: 849×–1,296×). Insertions and deletions (INDEL) were not called because of the potentially high false-positive rate of INDEL calling for data generated from the Ion Torrent sequencer. A total of 1,966 variants were identified. Genotypes with read depth (DP) < 8 or genotype quality (GQ) < 20 were considered ambiguous calls and were treated as missing values. Variants missing in more than 5% of the samples in cases and controls were excluded (N = 120). We excluded multiallelic variants (172 SNPs) and SNPs with Hardy–Weinberg equilibrium P < 10−4 (N = 2). Six candidate SNPs, which were significantly associated with NPC in a previous genome-wide association study (GWAS) pooled analysis (6), were interrogated with the Ion Torrent panel; as compared with the variant calling data, no discrepancy was observed.
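The genotype-level and variant-level filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, the missingness filter is applied to all samples pooled (the text does not say whether cases and controls were evaluated separately), and the Hardy–Weinberg P value is computed with a 1-degree-of-freedom chi-square test, which is an assumption because the test used is not stated.

```python
import math

MIN_DP = 8          # read-depth threshold from the text
MIN_GQ = 20         # genotype-quality threshold from the text
MAX_MISSING = 0.05  # maximum fraction of missing genotypes per variant
HWE_P_MIN = 1e-4    # variants with HWE P below this are excluded


def mask_genotype(gt, dp, gq):
    """Return the alt-allele count (0, 1, or 2), or None for an ambiguous call."""
    if gt is None or dp < MIN_DP or gq < MIN_GQ:
        return None
    return gt


def missing_rate(genotypes):
    """Fraction of samples whose genotype is missing (None)."""
    return sum(g is None for g in genotypes) / len(genotypes)


def hwe_chi2_p(genotypes):
    """Chi-square (1 df) Hardy-Weinberg P value; assumed test, not the authors'."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 1.0
    n_ref, n_het, n_alt = (called.count(k) for k in (0, 1, 2))
    p = (2 * n_ref + n_het) / (2 * n)  # reference-allele frequency
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic site: HWE is not testable
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref, n_het, n_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(calls):
    """calls: list of (genotype, DP, GQ) tuples for one biallelic variant."""
    genotypes = [mask_genotype(*c) for c in calls]
    if missing_rate(genotypes) > MAX_MISSING:
        return False
    return hwe_chi2_p(genotypes) >= HWE_P_MIN
```

Multiallelic variants would be dropped before this step; the sketch assumes biallelic input.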

Statistical analysis

We first focused on genes previously identified from a study of NPC multiplex families (N = 17) to test the hypothesis that genes involved in familial NPC are also associated with sporadic NPC. We then focused on genes associated with sporadic NPC from a previously reported NPC GWAS (N = 4; 1 overlapped with the list from the family study) both to confirm previous findings and to determine whether there is support for an association with NPC for additional variants within those genes.

For gene-based analyses, we used the Sequence Kernel Association combined sum test (SKAT-C) to evaluate the association of common variants alone, and of common (minor allele frequency, MAF, ≥1% among all study subjects) and rare (MAF < 1% among all study subjects) variants combined, with NPC risk (10). To evaluate the association between rare variants alone and NPC risk, we used the SKAT optimal adjusted test (SKAT-O), which encompasses SKAT and the burden test. All SKAT analyses were performed using default weights, adjusting for sex and age groups (in 10-year categories). For single-variant analyses (restricted to common variants), associations between SNPs and NPC were analyzed by logistic regression under a log-additive model, adjusting for study, sex, and age groups (in 10-year categories).
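To make the single-variant model concrete, the following Python sketch fits a logistic regression in which the genotype enters as the number of minor alleles (0, 1, or 2), so that the exponentiated genotype coefficient is a per-allele OR. It is a self-contained illustration using Newton-Raphson, not the software used in the study; the function names are hypothetical, and covariates such as study, sex, and age group are assumed to be supplied already coded as numeric indicator columns.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_logistic(rows, y, n_iter=25):
    """Maximum-likelihood logistic regression; an intercept is added here."""
    design = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(design[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(design, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for i in range(k):
                grad[i] += (yi - mu) * xi[i]
                for j in range(k):
                    hess[i][j] += w * xi[i] * xi[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def per_allele_or(genotypes, covariates, status):
    """Per-allele odds ratio under a log-additive model.

    genotypes: minor-allele counts (0/1/2); covariates: one list of numeric
    adjustment variables per subject; status: 1 for cases, 0 for controls.
    """
    rows = [[g] + list(c) for g, c in zip(genotypes, covariates)]
    beta = fit_logistic(rows, status)
    return math.exp(beta[1])  # beta[0] is the intercept
```

With no covariates and only two genotype classes, the estimate reduces to the cross-product OR of the 2 × 2 table, which gives a quick sanity check on the fit.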

We considered a gene-based association significant if the nominal P value was < 2.5 × 10−3, corresponding to a Bonferroni correction for the 20 genes evaluated. For the four SNPs with previously reported genome-wide significance (P < 5 × 10−8; ref. 6), we considered a P value of < 0.05 significant. For other SNPs, we considered a single-variant association significant if the nominal P value was < 2.7 × 10−4, corresponding to a Bonferroni correction for 187 tests (the total number of common variants evaluated). A sensitivity analysis was conducted among incident cases and all controls.
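The two Bonferroni thresholds follow directly from dividing a family-wise α of 0.05 by the number of tests. The short Python sketch below reproduces that arithmetic and applies the gene-level threshold to the combined-variant (SKAT-C) P values reported in this article; the function names are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def significant_genes(p_values, n_tests, alpha=0.05):
    """Names of genes whose P value falls below the corrected threshold."""
    cutoff = bonferroni_threshold(alpha, n_tests)
    return sorted(gene for gene, p in p_values.items() if p < cutoff)


gene_threshold = bonferroni_threshold(0.05, 20)   # 20 genes/gene regions
snp_threshold = bonferroni_threshold(0.05, 187)   # 187 common variants

# SKAT-C P values for combined common and rare variants, as reported here.
skat_c_p = {
    "CDKN2A/2B": 1.3e-4,
    "BRD2": 1.6e-3,
    "TNFRSF19": 4.0e-3,
    "CLPTM1L/TERT": 5.4e-3,
}
passing = significant_genes(skat_c_p, n_tests=20)
```

Under this rule only CDKN2A/2B and BRD2 fall below the gene-level threshold, which is why the TNFRSF19 and CLPTM1L/TERT results are described as suggestive.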

On the basis of the current sample size (∼800 cases and ∼900 controls), after correction for multiple tests (significance level = 2.7 × 10−4), we had at least 80% power to detect an OR as low as 3.4, 1.9, and 1.6, corresponding to MAF of 1%, 5%, and 10%, respectively. Because of the modest sample size, we had limited power to identify disease associations for rare variants.
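As a rough check on these figures, power can be approximated with a two-sided allelic test comparing allele frequencies between cases and controls under a normal approximation. This is only a sketch and not necessarily the method used for the calculation above: it assumes that the stated MAF is the control allele frequency, that the OR is a per-allele OR, that each subject contributes two independent alleles, and it uses the actual counts of 819 cases and 938 controls. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(maf, odds_ratio, n_cases, n_controls, alpha):
    """Approximate power of a two-sided allelic z-test (normal approximation)."""
    p0 = maf                                   # control allele frequency (assumed)
    odds1 = odds_ratio * p0 / (1.0 - p0)       # allele odds among cases
    p1 = odds1 / (1.0 + odds1)                 # case allele frequency
    se = sqrt(p1 * (1.0 - p1) / (2 * n_cases)
              + p0 * (1.0 - p0) / (2 * n_controls))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # The contribution of the opposite tail is negligible and is ignored.
    return NormalDist().cdf(abs(p1 - p0) / se - z_crit)


scenarios = [(0.01, 3.4), (0.05, 1.9), (0.10, 1.6)]
powers = [allelic_power(maf, odds, 819, 938, 2.7e-4) for maf, odds in scenarios]
```

Under these assumptions, each of the three scenarios gives power above 80%, consistent with the values stated in the text.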

Number of variants

From the selected 20 genes/gene regions, we identified a total of 1,670 variants among 1,757 participants (819 NPC; 938 controls), of which 187 were common (MAF ≥ 1%).
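The split into common and rare variants depends only on the MAF computed among all study subjects. A minimal Python sketch of that classification is given below; it assumes biallelic genotypes coded as alternate-allele counts with missing calls stored as None, and the function names are illustrative.

```python
def minor_allele_frequency(genotypes):
    """MAF from alt-allele counts (0/1/2); missing genotypes (None) are skipped."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def classify_variant(maf, threshold=0.01):
    """Label a variant 'common' (MAF >= 1%) or 'rare' (MAF < 1%)."""
    return "common" if maf >= threshold else "rare"
```

For example, a variant with two heterozygous carriers among 100 genotyped subjects has a MAF of exactly 1% and is therefore counted as common.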

Gene-based analyses

The association between the combined effect of rare and common variants and NPC risk is summarized in Table 1. For genes previously identified from a study of multiplex families, the most significant association was observed for BRD2 located on chromosome 6 (PSKAT-C = 0.0016). A suggestive association was observed for CLPTM1L/TERT located on chromosome 5 (PSKAT-C = 0.0054). In analyses restricted to rare variants (MAF < 1%), however, no evidence for significant associations for these two genes was noted, suggesting that the observed association was more likely explained by common variants within these gene regions. We observed a suggestive association with NPC for rare variants in HNRNPU located on chromosome 1 (PSKAT-O = 0.0038).

For genes previously identified from a meta-analysis of NPC GWAS, the most significant association was observed for CDKN2A/2B located on chromosome 9 (PSKAT-C = 1.3 × 10−4), followed by a suggestive association for TNFRSF19 located on chromosome 13 (PSKAT-C = 0.0040). Again, no evidence for significant associations for these two genes was noted in analyses restricted to rare variants.

SNP-based analyses

The association between individual common SNPs and NPC risk is summarized in Table 2 and Supplementary Tables S2 and S3. For SNPs from genes previously identified from a multiplex family study, suggestive evidence in support of an association with NPC was observed for rs78231671 (P = 4.7 × 10−4, Table 2), an intronic SNP in the PRKDC gene region, and for rs76146382 (P = 5.8 × 10−4) in the BRD2 gene region.

We next evaluated the four SNPs that were previously reported to be significantly associated with NPC in a meta-analysis (6). All four SNPs, located within the CLPTM1L/TERT, CDKN2A/2B, TNFRSF19, and MECOM genes/gene regions, were significantly associated with NPC, and the direction of the association was the same as previously reported (P < 0.05, Table 2). Of an additional 29 SNPs evaluated within these four genes (Supplementary Table S2), 10 were in suggestive linkage disequilibrium (LD) with the primary a priori SNPs evaluated (r2 > 0.20 with the primary a priori SNP, Supplementary Fig. S1A–S1D). Of the remaining 19 SNPs, we found suggestive evidence for an association with NPC for one SNP within CLPTM1L/TERT, rs13167280 (P = 0.012, Supplementary Table S2), and one SNP within the MECOM region, rs17466625 (P = 0.011).
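For readers unfamiliar with the r2 criterion, one common approximation when only unphased genotypes are available is the squared Pearson correlation between the allele counts of two SNPs. The Python sketch below uses that approximation; it is an assumption for illustration, since the method used to compute LD is not described here, and the function names are hypothetical.

```python
def genotype_r2(snp_a, snp_b):
    """Squared Pearson correlation of allele counts (0/1/2) at two SNPs.

    Subjects with a missing genotype (None) at either SNP are dropped.
    """
    pairs = [(a, b) for a, b in zip(snp_a, snp_b)
             if a is not None and b is not None]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_a * var_b)


def in_suggestive_ld(r2, threshold=0.20):
    """Apply the r2 > 0.20 criterion used in the text."""
    return r2 > threshold
```

SNPs that pass this criterion with a primary a priori SNP are unlikely to represent independent signals, which is why the remaining 19 SNPs were examined separately.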

The association between other common SNPs and NPC risk is summarized in Supplementary Table S3. Associations of similar magnitude were observed in analysis among incident cases and controls (Supplementary Tables S4 and S5).

In this targeted gene–sequencing study, we extended our findings from a study of NPC multiplex families aimed at identifying genes associated with familial NPC risk by showing that two genes, CLPTM1L/TERT and BRD2, are also associated with sporadic NPC risk in Taiwan. A suggestive association was observed for HNRNPU. In addition, we confirmed results from a recent meta-GWAS of NPC by showing that four previously reported SNPs in the CDKN2A/2B, MECOM, TNFRSF19, and CLPTM1L/TERT genes/gene regions also conferred risk for sporadic NPC in this Taiwanese sample.

The biological implications of CDKN2A/2B, TNFRSF19, and CLPTM1L/TERT in NPC pathogenesis have been discussed previously (6, 11). These genes are involved in cell growth control (CDKN2A/2B and TNFRSF19) and telomere length maintenance (CLPTM1L/TERT). Both BRD2 and HNRNPU are also involved in telomere biology (12, 13). BRD2 is one of the host chromatin factors interacting with the Kaposi sarcoma–associated herpesvirus latency–associated nuclear antigen 1 that affects chromatin structure and maintenance of the telomere repeat (14). Of note, the most significantly associated SNP within BRD2 was rs76146382. This SNP is in LD with an extended HLA haplotype known to be associated with NPC, HLA-A*3303∼B*5801∼DRB1*0301 (r2 > 0.6; refs. 7, 15). It remains to be determined whether the association with BRD2 observed in this study is explained by or independent of this previously reported HLA association. The role of HNRNPU in NPC pathogenesis is not clear. Overexpression of HNRNPU has been associated with telomere shortening (12). In addition, HNRNPU shows strong sequence homology with EBV nuclear antigen 1 (EBNA1), which is important for NPC development (16). Whether this similarity has a biological implication is unclear. Taken together, although other mechanisms merit further investigation, our findings further support a role for telomere maintenance in NPC pathogenesis.

Variants in other genes previously linked to familial NPC were not associated with the risk of sporadic NPC. This finding could arise because the genes involved in familial NPC pathogenesis differ from those involved in sporadic NPC pathogenesis, or because of false-positive findings in our previous familial NPC study. In addition, false-negative findings cannot be ruled out, given the modest sample size and limited power of our study to identify disease associations for rare variants that occur at very low frequency in the population and account for a small proportion of NPC cases. More studies are needed to confirm our findings.

In summary, variants in several genes previously linked to familial NPC were associated with the risk of sporadic NPC, and the risk was more likely to be driven by common rather than rare variants within these genes. We also confirmed previously reported associations from NPC GWAS for SNPs in four genes. Our findings highlight the important role for telomere length maintenance in NPC pathogenesis.

No potential conflicts of interest were disclosed.

The funding organization played no role in the study design, collection, management, analysis, and interpretation of the data; or preparation, review, and approval of the manuscript.

Conception and design: Z. Liu, A.M. Goldstein, K.J. Yu, C.-H. Hua, T.-L. Yang, C.K. Hsiao, P.-J. Lou, C.-J. Chen, A. Hildesheim

Development of methodology: Z. Liu, K.J. Yu, T.-L. Yang, K. Jones, G. Yu, C.-J. Chen, A. Hildesheim

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): W.-L. Hsu, K.J. Yu, Y.-C. Chien, J.J.-M. Jian, Y.-A. Tsou, L.-J. Liao, Y.-L. Chang, C.-P. Wang, J.-S. Wu, J.-C. Lee, T.-L. Yang, M.-S. Wu, M.-H. Tsai, K.-K. Huang, M. Yeager, G. Yu, P.-J. Lou, C.-J. Chen, A. Hildesheim

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Z. Liu, C.-P. Wang, T.-L. Yang, K. Yu, B. Zhu, G. Yu, P.-J. Lou, C.-J. Chen, A. Hildesheim

Writing, review, and/or revision of the manuscript: Z. Liu, A.M. Goldstein, W.-L. Hsu, K.J. Yu, L.-J. Liao, C.-P. Wang, T.-L. Yang, C.K. Hsiao, K. Yu, M. Yeager, P.-J. Lou, C.-J. Chen, A. Hildesheim

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): W.-L. Hsu, K.J. Yu, Y.-C. Chien, J.-Y. Ko, J.J.-M. Jian, Y.-A. Tsou, L.-J. Liao, C.-P. Wang, C.-H. Hua, T.-L. Yang, C.K. Hsiao

Study supervision: Y.-A. Tsou, Y.-S. Leu, C.-H. Hua, T.-L. Yang, C.-J. Chen, A. Hildesheim

This research was supported by the NCI Intramural Research Program and research grants from Academia Sinica, Taipei, Taiwan.

The GEV-NPC Study Group includes Genomics Research Center, Academia Sinica, Taipei, Taiwan (Wan-Lun Hsu, Yin-Chu Chien, Chien-Jen Chen); Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD (Allan Hildesheim, Zhiwei Liu, Kelly J. Yu); Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University, College of Medicine, Taipei, Taiwan (Jenq-Yuh Ko, Cheng-Ping Wang, Tsung-Lin Yang, Pei-Jen Lou, Chun-Nan Chen, Tseng-Cheng Chen, Chih-Feng Lin); Department of Radiation Oncology, Koo Foundation Sun Yat-Sen Cancer Center, Taipei, Taiwan (James Jer-Min Jian, Skye Hongiun Cheng, Yu-Chen Tsai, Yih-Lin Chung, Jia-Shing Wu, Ming-Jiung Liu, Kuei-Kang Huang); Department of Pathology and Laboratory Medicine, Koo Foundation Sun Yat-Sen Cancer Center, Taipei, Taiwan (Mei-Hua Tsou); Department of Hematology and Medical Oncology, Koo Foundation Sun Yat-Sen Cancer Center, Taipei, Taiwan (Hsin-Hsuan Chen); Division of Otolaryngology Head and Neck Surgery (Ching-Yuan Lin, Shyuang-Der Terng, Fang-Yin Lin, Hsin-I Huang); Department of Otorhinolaryngology, China Medical University Hospital, Taichung, Taiwan (Yung-An Tsou, Chun-Hung Hua, Ming-Hsui Tsai); Department of Otolaryngology, MacKay Memorial Hospital, Taipei, Taiwan (Yi-Shing Leu, Jehn-Chuan Lee); Department of Otolaryngology, Far Eastern Memorial Hospital, New Taipei City, Taiwan (Li-Jen Liao); Department of Otolaryngology, Cathay General Hospital, Taipei, Taiwan (Yen-Liang Chang); Graduate Institute of Epidemiology and Preventative Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan (Chuhsing Kate Hsiao); Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan (Ming-Shiang Wu).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Pathmanathan R, Prasad U, Sadler R, Flynn K, Raab-Traub N. Clonal proliferations of cells infected with Epstein-Barr virus in preinvasive lesions related to nasopharyngeal carcinoma. N Engl J Med 1995;333:693–8.
2. Niedobitek G. Epstein-Barr virus infection in the pathogenesis of nasopharyngeal carcinoma. Mol Pathol 2000;53:248–54.
3. Bei JX, Jia WH, Zeng YX. Familial and large-scale case-control studies identify genes associated with nasopharyngeal carcinoma. Semin Cancer Biol 2012;22:96–106.
4. Hildesheim A, Wang CP. Genetic predisposition factors and nasopharyngeal carcinoma risk: a review of epidemiological association studies, 2000–2011: Rosetta Stone for NPC: genetics, viral infection, and other environmental factors. Semin Cancer Biol 2012;22:107–16.
5. Yu G, Hsu WL, Coghill AE, Yu KJ, Wang CP, Lou PJ, et al. Whole exome sequencing of nasopharyngeal carcinoma families reveals novel variants potentially involved in nasopharyngeal carcinoma. Sci Rep 2019;9:9916.
6. Bei JX, Su WH, Ng CC, Yu K, Chin YM, Lou PJ, et al. A GWAS meta-analysis and replication study identifies a novel locus within CLPTM1L/TERT associated with nasopharyngeal carcinoma in individuals of Chinese ancestry. Cancer Epidemiol Biomarkers Prev 2016;25:188–92.
7. Hildesheim A, Apple RJ, Chen CJ, Wang SS, Cheng YJ, Klitz W, et al. Association of HLA class I and II alleles and extended haplotypes with nasopharyngeal carcinoma in Taiwan. J Natl Cancer Inst 2002;94:1780–9.
8. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 2011;43:491–8.
9. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 2010;38:e164.
10. Ionita-Laza I, Lee S, Makarov V, Buxbaum JD, Lin X. Sequence kernel association tests for the combined effect of rare and common variants. Am J Hum Genet 2013;92:841–53.
11. Bei JX, Li Y, Jia WH, Feng BJ, Zhou G, Chen LZ, et al. A genome-wide association study of nasopharyngeal carcinoma identifies three new susceptibility loci. Nat Genet 2010;42:599–603.
12. Fu D, Collins K. Purification of human telomerase complexes identifies factors involved in telomerase biogenesis and telomere length regulation. Mol Cell 2007;28:773–85.
13. Deng Z, Wang Z, Lieberman PM. Telomeres and viruses: common themes of genome maintenance. Front Oncol 2012;2:201.
14. Ballestas ME, Kaye KM. The latency-associated nuclear antigen, a multifunctional protein central to Kaposi's sarcoma-associated herpesvirus latency. Future Microbiol 2011;6:1399–413.
15. Gourraud PA, Khankhanian P, Cereb N, Yang SY, Feolo M, Maiers M, et al. HLA diversity in the 1000 genomes dataset. PLoS One 2014;9:e97282.
16. Fischer N, Voss MD, Mueller-Lantzsch N, Grasser FA. A potential NES of the Epstein-Barr virus nuclear antigen 1 (EBNA1) does not confer shuttling. FEBS Lett 1999;447:311–4.