This study aimed to identify novel susceptibility variants for second primary tumor (SPT) or recurrence in curatively treated early-stage head and neck squamous cell carcinoma (HNSCC) patients.

We constructed a custom chip containing a comprehensive panel of 9,645 chromosomal and mitochondrial single nucleotide polymorphisms (SNP) representing 998 cancer-related genes selected by a systematic prioritization schema. Using this chip, we genotyped 150 early-stage HNSCC patients with and 300 matched patients without SPT/recurrence from a prospectively conducted randomized trial and assessed the association of these SNPs with risk of SPT/recurrence.

Individually, six chromosomal SNPs and seven mitochondrial SNPs were significantly associated with risk of SPT/recurrence after adjustment for multiple comparisons. A strong gene-dosage effect was observed when these SNPs were combined, as evidenced by a progressively increasing SPT/recurrence risk as the number of unfavorable genotypes increased (P for trend < 1.00 × 10−20). Several polygenic analyses suggested an important role of interconnected functional networks and gene-gene interactions in modulating SPT/recurrence. Furthermore, incorporating these genetic markers into a multivariate model significantly improved discriminatory ability over models containing only clinical and epidemiologic variables.

This is the first large-scale systematic evaluation of germ-line genetic variants for their roles in HNSCC SPT/recurrence. The study identified several promising susceptibility loci and showed the cumulative effect of multiple risk loci in HNSCC SPT/recurrence. Furthermore, this study underscores the importance of incorporating germ-line genetic variation data with clinical and risk factor data in constructing prediction models for clinical outcomes.

Approximately 10% of early-stage head and neck squamous cell carcinoma (HNSCC) patients develop locoregional recurrence and 15% to 25% develop second primary tumors (SPT) within 5 years of initial diagnosis (1, 2). As diagnostic and therapeutic approaches continue to improve, the ability to accurately predict SPT/recurrence in early-stage HNSCC patients would facilitate intensive surveillance or targeted interventions for high-risk patients and thereby reduce mortality and morbidity.

Clinical (index tumor site and disease stage) and lifestyle (continued smoking and alcohol drinking) factors contribute to the risk of SPT and recurrence (3, 4). HNSCC tumorigenesis is a multistep process involving an accumulation of progressive genetic alterations (5), including genomic alterations of multiple chromosomes (3p, 9p, 13q, and 17p; refs. 6, 7) and mutations of essential oncogenes and tumor suppressor genes (p53, p16, cyclin D1, KRAS, and FHIT; refs. 8, 9). Many of these somatic alterations have also been linked to SPT/recurrence development.

We previously reported that high mutagen sensitivity measured by an in vitro lymphocytic assay, reflecting constitutional genetic instability, was associated with increased risk of SPT/recurrence (10, 11). Although the association between single nucleotide polymorphisms (SNP) and risk of HNSCC (12, 13) has been extensively investigated, no studies have investigated their association with SPT/recurrence. To address this issue, we conducted this nested case-control analysis to test the hypothesis that common sequence variants affect the risk of SPT/recurrence in curatively treated HNSCC patients. Because a genome-wide scanning approach was not an option owing to the limited availability of samples from HNSCC patients who developed SPT/recurrence, we constructed a comprehensive panel of 9,645 SNPs in 998 cancer-related genes to assess both their individual and combined effects on SPT/recurrence. We also constructed risk prediction models of SPT/recurrence based on known clinical and epidemiologic risk factors and the SNPs identified in this study.

Study population and epidemiologic data

The subjects included in this study were participants enrolled (1991-1999) in the Retinoid Head and Neck Second Primary Trial, which was designed to evaluate whether daily low-dose 13-cis-retinoic acid (13-cRA) prevents SPT or tumor recurrence in early-stage HNSCC patients (1). Briefly, patients with histologically confirmed stage I or II HNSCC who were cancer-free for at least 16 wk after the end of treatment were eligible for randomization to either low-dose (30 mg/d) 13-cRA treatment or placebo for 3 y, with a minimum of 4 y of planned follow-up. The stratification criteria for randomization included the primary tumor site (larynx, oral cavity, and pharynx), tumor stage (stage I or II), and smoking status (current, former, or never smoker). Never smokers were individuals who had smoked <100 total cigarettes during their lifetime. Former smokers were individuals who had stopped smoking for at least 1 y at the time of enrollment (14). Patients were evaluated at 3, 6, 9, 12, 16, 20, 24, 28, 32, and 36 mo after randomization. After completing treatment, patients were followed up at 6-mo intervals for an additional 4 y. Standard criteria for diagnosis of a SPT were applied (15). The major sites of SPT in this population were lung (29.8%), head and neck (28.0%), prostate (14.2%), and bladder (5.1%). Local recurrence was defined as any tumor of similar histology appearing within 2 cm or within 3 y of the primary tumor. Among ∼1,190 patients enrolled, 354 developed SPT/recurrence. However, only 150 of these patients had blood DNA samples available. Therefore, we designed a nested case-control study to evaluate these 150 patients with SPT/recurrence, designated as cases, and 300 patients without SPT/recurrence as controls. We compared these 150 cases with the cases not included in this study and did not find significant differences in terms of age, sex, smoking, alcohol, tumor site, stage, radiotherapy, surgery, or 13-cRA treatment. Patients included in the study had a higher percentage of Caucasians (95%) than patients not included (89%; P = 0.001). Apart from this difference, we are confident that there is minimal patient selection bias. The study was approved by the Institutional Review Board of The University of Texas M. D. Anderson Cancer Center. Informed consent was obtained from all participants.

Development of the iSelect Infinium II cancer gene/SNP BeadChip

We developed a customized and comprehensive panel of cancer-related genes involved in 12 major cellular pathways (Supplementary Table S1). For each specific pathway, genes were subcategorized according to their major reported functions. To generate an unbiased relevant gene list, we used the Gene Ontology (GO), a comprehensive database of gene annotation. We further used the Cancer Genome Anatomy Project GO Browser to pinpoint all relevant ontology terms for probing the GO database. We then reviewed the literature on the genes returned by the GO database, querying PubMed with the HUGO name, the common aliases, and “cancer” as keywords to assess cancer relevance. We then assigned a priority score to each gene based on the importance and relevance of the gene to the specific cancer pathway. For each gene with a high priority score, we identified the tagSNPs ranging from 10 kb upstream of the 5′-untranslated region (UTR) to 10 kb downstream of the 3′-UTR of the gene (16). We also included potentially functional SNPs, which are located in the functional regions of the genes, including coding (synonymous SNPs and nonsynonymous SNPs) and regulatory (promoter, splicing site, 5′-UTR, and 3′-UTR) regions. Each gene was then analyzed using the LDSelect program to divide SNPs into bins based on an r2 threshold of 0.8 and a minor allele frequency (MAF) ≥0.05 in Caucasians. For genes with a medium priority score, only potentially functional SNPs were identified. For tagSNP selection, we selected one SNP from each bin according to preset criteria considering the validation status, designability score, position, and bead type number of specific SNPs. For potentially functional SNP selection, we included all two-hit or HapMap validated SNPs with a designability score ≥0.6 and a MAF ≥0.01 in Caucasians. Overall, 9,645 SNPs were included on the BeadChip (Supplementary Table S1). The complete set of selected SNPs was submitted to Illumina technical support for the Infinium II chemistry designability and bead type analyses using a proprietary program developed by Illumina (17).
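The binning step described above can be illustrated with a minimal Python sketch. It is a simplified greedy procedure in the spirit of r2-based binning, not the LDSelect program itself; the function name `greedy_ld_bins`, the SNP labels, and the toy r2 and MAF values are hypothetical and chosen only for illustration.

```python
def greedy_ld_bins(snps, r2, maf, r2_min=0.8, maf_min=0.05):
    """Group SNPs into bins of mutually informative markers (simplified sketch).

    snps : list of SNP labels (hypothetical names)
    r2   : dict mapping frozenset({a, b}) -> pairwise r2
    maf  : dict mapping SNP label -> minor allele frequency
    """
    # Only SNPs at or above the MAF threshold are eligible for binning.
    remaining = [s for s in snps if maf[s] >= maf_min]

    def linked(a, b):
        return a == b or r2.get(frozenset((a, b)), 0.0) >= r2_min

    bins = []
    while remaining:
        # Seed each bin with the SNP linked to the most remaining SNPs.
        seed = max(remaining, key=lambda s: sum(linked(s, t) for t in remaining))
        members = [t for t in remaining if linked(seed, t)]
        bins.append(members)
        remaining = [t for t in remaining if t not in members]
    return bins


# Toy input: snp_e falls below the MAF threshold and is never binned.
toy_snps = ["snp_a", "snp_b", "snp_c", "snp_d", "snp_e"]
toy_r2 = {
    frozenset(("snp_a", "snp_b")): 0.90,
    frozenset(("snp_b", "snp_c")): 0.85,
    frozenset(("snp_a", "snp_c")): 0.50,
    frozenset(("snp_d", "snp_e")): 0.95,
}
toy_maf = {"snp_a": 0.20, "snp_b": 0.25, "snp_c": 0.30, "snp_d": 0.10, "snp_e": 0.02}
toy_bins = greedy_ld_bins(toy_snps, toy_r2, toy_maf)
```

One tagSNP would then be chosen from each bin; the study's preset criteria for that choice (validation status, designability score, position, bead type number) are not modeled here.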

Genotyping

Genomic DNA was extracted from peripheral blood lymphocytes. Genotyping was carried out according to the standard 3-d protocol provided by Illumina. The genotypes were autocalled using the BeadStudio software.

Statistical analysis

Statistical analyses were done using Intercooled STATA software (STATA Corp.) and SAS/Genetics, version 9.0 (SAS Institute). χ2 analysis was used to assess the differences between subject groups with regard to categorical variables and Student's t test for continuous variables. For each chromosomal SNP, the risks of SPT/recurrence were estimated as hazard ratios (HR) and 95% confidence intervals (95% CI) using multivariable Cox proportional hazard regression models adjusted for age, gender, ethnicity, smoking status, tumor site, stage, and treatment, where appropriate. Three genetic models (dominant, recessive, and additive) were tested for each SNP, and the model with the highest significance was considered the best-fitting model and used to measure the statistical significance of each SNP (18). For mitochondrial SNPs (mtSNP), the heterozygous genotypes were treated as missing data because these calls typically result from either DNA contamination or heteroplasmy (19). The wild-type and variant genotypes of mtSNPs were then analyzed in the same way as chromosomal SNPs. Multiple hypothesis testing was addressed using the q value, a measure of significance in terms of the false discovery rate, as implemented in an R package (20). The multiple comparison adjustment was carried out for the best-fitting model representing the significance of the association for each SNP. We applied a bootstrap resampling method to internally validate the results. We generated 100 bootstrap samples. Each bootstrap sample was drawn from the original data set, and a P value was obtained for each SNP under the dominant, recessive, and additive models. The cumulative effects of unfavorable genotypes on SPT/recurrence were tested for the combined top SNPs that showed a significant q value (<0.05) and also had a bootstrap P value of <0.01 in at least 80% of the bootstrap samples. Based on the percentage of patients developing SPT/recurrence, subjects were categorized into low-risk (<25%), medium low-risk (25-50%), medium high-risk (51-75%), and high-risk (>75%) groups by number of unfavorable genotypes. We calculated the HRs and 95% CIs for all other groups compared with the low-risk reference group using a multivariable Cox proportional hazard regression model. Kaplan-Meier estimates were calculated to plot the event-free curve for each group, and the log-rank test was used to compare survival between these groups. We also constructed receiver operating characteristic curves and calculated the area under the curve (AUC) to evaluate the specificity and sensitivity of predicting SPT/recurrence by incorporating different combinations of epidemiologic, clinical, and genetic predictor variables. We included only SNPs internally validated by bootstrapping in these analyses. A two-sided P ≤ 0.05 was considered the threshold of statistical significance.
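The three genetic models and the handling of heterozygous mtSNP calls can be made concrete with a short Python sketch. It shows only how a genotype might be coded as a model covariate and how the best-fitting model would be picked from three P values; the function names and the example P values are hypothetical, and the Cox regression itself is not reproduced.

```python
def encode_genotype(n_variant, model):
    """Code a chromosomal genotype for one genetic model.

    n_variant : 0, 1, or 2 copies of the variant allele (None if missing)
    model     : "dominant", "recessive", or "additive"
    """
    if n_variant is None:
        return None
    if model == "additive":
        return n_variant                    # per-allele (trend) coding
    if model == "dominant":
        return 1 if n_variant >= 1 else 0   # carriers vs. wild-type homozygotes
    if model == "recessive":
        return 1 if n_variant == 2 else 0   # variant homozygotes vs. all others
    raise ValueError(f"unknown model: {model}")


def encode_mt_genotype(call):
    """Code an mtSNP call; heterozygous calls are treated as missing."""
    return {"wild": 0, "variant": 1}.get(call)  # "het" -> None


def best_fitting_model(p_by_model):
    """Return the model with the smallest P value (highest significance)."""
    return min(p_by_model, key=p_by_model.get)


# Hypothetical P values for one SNP under the three models.
example_p = {"dominant": 0.03, "recessive": 0.001, "additive": 0.01}
chosen = best_fitting_model(example_p)
```

Because the best of three models is selected per SNP, the selected P value is optimistic on its own, which is one reason the multiple comparison adjustment and bootstrap validation described above matter.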

Characteristics of the study population

One hundred and fifty patients with SPT/recurrence (cases) were matched 1:2 to 300 patients without SPT/recurrence (controls) by age (±5 years), gender, and ethnicity (Supplementary Table S2). There were no significant differences between these two groups in radiotherapy (P = 0.71), surgery (P = 0.34), or 13-cRA treatment arm (P = 0.42). There appeared to be more current smokers in the SPT/recurrence group (42%) than in the no-event group (34%), and more stage II patients in the former group (41%) than in the latter (34%), although neither comparison reached statistical significance (P = 0.22 and 0.13, respectively). However, significant differences were observed between the two groups in pack-years (P = 0.007) and tumor site (P = 6.0 × 10−5).

iSelect Infinium II BeadChip content and genotyping quality controls

There were 998 genes represented by 9,645 SNPs on the BeadChip (Supplementary Table S3). Seventy-eight percent were tagging SNPs and 22% were potentially functional SNPs. The initial conversion rate of the BeadChip synthesis was 90.61%, leaving 8,739 SNPs (8,583 chromosomal SNPs and 156 mtSNPs) with reliable genotyping data. Individuals with >5% missing genotypes, SNPs with >5% missing calls, chromosomal SNPs with <1% MAF, and mtSNPs with <5% MAF were excluded. After applying these filters, 8,370 SNPs and 440 study subjects (147 cases and 293 controls) were included in the following analyses.
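The SNP-level quality-control filters above can be sketched in Python as follows. This is a minimal illustration for diploid (chromosomal) genotypes coded as 0, 1, or 2 variant alleles with None for a missing call; the function names and toy genotype vectors are hypothetical, and the thresholds are the ones stated in the text. The last lines check the reported conversion-rate arithmetic.

```python
def missing_rate(calls):
    """Fraction of missing genotype calls (None) for one SNP."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_freq(calls):
    """MAF from diploid genotypes coded as 0/1/2 variant alleles."""
    called = [c for c in calls if c is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))  # variant allele frequency
    return min(p, 1 - p)


def snp_passes_qc(calls, max_missing=0.05, min_maf=0.01):
    """Keep a chromosomal SNP with <=5% missing calls and MAF >= 1%."""
    return missing_rate(calls) <= max_missing and minor_allele_freq(calls) >= min_maf


# Toy genotype vectors (hypothetical data).
common_snp = [0] * 95 + [1] * 5                 # MAF 0.025, no missing calls
rare_snp = [0] * 99 + [1]                       # MAF 0.005, fails the MAF filter
sparse_snp = [None] * 6 + [0] * 47 + [1] * 47   # 6% missing, fails call rate

# Consistency check of the reported counts: 90.61% of 9,645 SNPs.
converted = round(9645 * 0.9061)
```

The stricter 5% MAF threshold for mtSNPs could be applied by passing `min_maf=0.05`, with the allele frequency computed on haploid calls instead.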

Significant individual SNPs associated with SPT/recurrence in the main effect analysis

Because the genetic background and replication patterns differ markedly between chromosomal SNPs and mtSNPs, we analyzed the two groups separately. Table 1 lists the top 20 chromosomal SNPs sorted by P value. Six SNPs remained statistically significant after multiple comparison adjustment using the q value (Table 1). The most significant SNP (rs12359892) was located in the 3′ region of the MKI67 gene. The homozygous variant genotype was associated with a 2.65-fold (95% CI, 1.72-4.11; P = 1.25 × 10−5; q = 0.042) increased risk of SPT/recurrence under the recessive genetic model. Seven mtSNPs had significant q values after multiple comparison adjustment (Table 2). mitoA11813G, located in the NADH dehydrogenase subunit 4 (ND4) gene, was the most significant mtSNP. The HR of the variant allele was 0.06 (95% CI, 0.01-0.44; P = 1.24 × 10−6; q = 1.98 × 10−5) compared with the wild-type allele. We then drew 100 bootstrap samples for internal validation and listed the number of samples in which the bootstrap P value was <0.01 for each SNP (Tables 1 and 2). Of the top 20 chromosomal SNPs, 12 had a bootstrap P value of <0.01 in at least 80% of the samples (Table 1, shaded SNPs). The top SNP, MKI67 rs12359892, gave a highly consistent result, with P < 0.01 in 96 of the 100 bootstrap samples (Table 1). The top three mtSNPs had a bootstrap P value of <0.01 in at least 80% of the samples (Table 2). The top mtSNP, mitoA11813G, was likewise highly consistent, with a bootstrap P value of <0.01 in 98 samples.

Table 1.

Associations of SPT/recurrence with the top 20 chromosomal SNPs

Table 2.

Associations of SPT/recurrence with mtSNPs remaining significant after multiple comparison adjustment

                                          Mitochondrial  Allelic  Genotype counts*  Cox model                                     No. times P < 0.01
SNP          Host gene/region  SNP type   position       change   SPT      No SPT   HR (95% CI)       P             q             in bootstrap sample
mitoA11813G  Mt-ND4            sSNP       11812          A>G      146/1    259/34   0.06 (0.01-0.44)  1.24 × 10−6   1.98 × 10−5   98
mitoG15929A  Mt-TT             Noncoding  15928          G>A      143/4    252/39   0.20 (0.08-0.56)  5.47 × 10−5   4.37 × 10−4   94
mitoA14906G  Mt-CYB            sSNP       14905          A>G      141/6    248/41   0.28 (0.12-0.63)  1.95 × 10−4   1.04 × 10−3   83
mitoT10464C  Mt-TR             Noncoding  10463          T>C      140/7    253/39   0.34 (0.16-0.73)  1.04 × 10−3   4.17 × 10−3   73
mitoA11252G  Mt-ND4            sSNP       11251          A>G      132/14   230/60   0.47 (0.27-0.82)  3.21 × 10−3   1.03 × 10−2   68
mitoG3012A   Mt-RNR2           Noncoding  3010           G>A      101/46   237/56   1.73 (1.21-2.49)  3.87 × 10−3   1.03 × 10−2   70
mitoT14767C  Mt-CYB            Thr > Ile  14766          T>C      60/87    166/127  1.60 (1.14-2.25)  5.90 × 10−3   1.35 × 10−2   67

Abbreviations: TT, tRNA threonine; TR, tRNA arginine.

*Genotype counts: wild genotype/variant genotype.

Adjusted for age, gender, ethnicity, smoking status, tumor site, tumor stage, and treatment.
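The relation between the P and q columns of Table 2 can be explored with a small Python sketch. The study computed q values with the method of ref. 20; the code instead applies the simpler Benjamini-Hochberg step-up adjustment, named here plainly as a stand-in. It also assumes that 16 mtSNPs were tested, which the text does not state: the figure is only an inference from the ratio q/P of about 16 for the top mtSNP. The nine P values of 1.0 are placeholders for mtSNPs not shown in the table, and `bh_adjust` is a hypothetical helper name.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# P values of the seven mtSNPs as listed in Table 2.
table2_p = [1.24e-6, 5.47e-5, 1.95e-4, 1.04e-3, 3.21e-3, 3.87e-3, 5.90e-3]

# ASSUMPTION: 16 tests in total; nine placeholders stand in for the
# untabulated mtSNPs, whose true P values are not given in the text.
placeholders = [1.0] * 9
adjusted = bh_adjust(table2_p + placeholders)[:7]
```

Under these assumptions the adjusted values agree with the tabulated q values to within rounding of the published P values, for example 1.24 × 10−6 × 16 = 1.98 × 10−5 for mitoA11813G. This is a consistency check only and does not establish how many mtSNPs were actually tested.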

To increase sample size and statistical power, we grouped all SPT cases in our analysis. Because the relevance of prostate cancer and other non–smoking-related or nonaerodigestive tract cancers as SPT may not be clear, we also did separate analyses of smoking-related and aerodigestive tract SPT and compared the results with those for the entire SPT group. Of the top 20 chromosomal SNPs that were significant among all SPT cases (Table 1), 18 remained significant at the 0.05 level in both the smoking-related and aerodigestive tract SPT subgroup analyses, one remained significant at the 0.05 level for smoking-related SPT and was of borderline significance for aerodigestive tract SPT, and the remaining SNP had a P value of 0.11 for smoking-related SPT cases and 0.15 for aerodigestive tract SPT cases. The HR estimates were similar and the best-fitting models were the same for the top 20 chromosomal SNPs (Supplementary Table S4). A similar pattern was observed for the top mtSNPs (Supplementary Table S5). We chose to present data from all SPT cases to reflect the general risk of developing any new tumor.

Cumulative effects of the unfavorable genotypes

We further evaluated the cumulative effects of the high-risk genotypes on SPT/recurrence by summing the unfavorable genotypes of the above-described top risk-conferring chromosomal SNPs and mtSNPs that had bootstrap P values of <0.01 in at least 80% of the bootstrap samples. Twelve chromosomal SNPs and 1 mtSNP (mitoG15929A and mitoA14906G were excluded because of high linkage disequilibrium with mitoA11813G) were included in this analysis. As shown in Table 3, there was a significant gene-dosage effect. Compared with those in the low-risk reference group (≤4 unfavorable genotypes), subjects with medium low risk (5-6 unfavorable genotypes), medium high risk (7), and high risk (≥8) had 4.29-fold (95% CI, 2.52-7.29; P = 7.59 × 10−8), 9.16-fold (95% CI, 5.52-17.83; P = 1.80 × 10−14), and 26.72-fold (95% CI, 14.00-50.99; P < 1 × 10−20) increased SPT/recurrence risks, respectively (P for trend < 1 × 10−20). The median event-free survival times were 79.4, 49.2, and 14.6 months for the medium low-risk, medium high-risk, and high-risk groups, respectively, compared with >93.0 months for the low-risk group (P = 9.92 × 10−38, log-rank test; Fig. 1).

Table 3.

The cumulative effects of unfavorable genotypes on SPT/recurrence

No. unfavorable genotypes*  SPT/recurrence, n (%)   No SPT/recurrence, n (%)   HR (95% CI)           P
≤4 (reference group)        18 (10.91)              147 (89.09)                Reference
5-6                         62 (37.58)              103 (62.42)                4.29 (2.52-7.29)      7.59 × 10−8
7                           34 (61.82)              21 (38.18)                 9.16 (5.52-17.83)     1.80 × 10−14
≥8                          25 (96.15)              1 (3.85)                   26.72 (14.00-50.99)   <1.00 × 10−20
P for trend                                                                                          <1.00 × 10−20

*Unfavorable genotype was based on the 12 chromosomal SNPs and 1 mtSNP as described in text.

Adjusted for age, gender, smoking status, ethnicity, tumor site, tumor stage, and treatment.
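The grouping used in Table 3 can be written out as a short Python sketch that maps a count of unfavorable genotypes to a risk group and recomputes the event percentages from the tabulated counts. The cut points come from the text; the function and variable names are hypothetical, and the adjusted Cox HRs are not recomputed because they require the individual-level data.

```python
def risk_group(n_unfavorable):
    """Map a count of unfavorable genotypes to the Table 3 risk group."""
    if n_unfavorable <= 4:
        return "low"
    if n_unfavorable <= 6:
        return "medium low"
    if n_unfavorable == 7:
        return "medium high"
    return "high"


# (events, non-events) per group, taken from Table 3.
table3_counts = {
    "low": (18, 147),
    "medium low": (62, 103),
    "medium high": (34, 21),
    "high": (25, 1),
}

# Percentage of patients who developed SPT/recurrence in each group.
event_pct = {
    group: round(100 * events / (events + non_events), 2)
    for group, (events, non_events) in table3_counts.items()
}
```

The recomputed percentages fall in the <25%, 25-50%, 51-75%, and >75% bands used to define the four groups in the statistical analysis section.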

Fig. 1.

Kaplan-Meier event-free survival curves for SPT/recurrence by risk group, defined by the number of unfavorable genotypes among 12 chromosomal SNPs and 1 mtSNP.


Model discrimination ability

We next constructed prediction models by incorporating established prognostic clinical variables (tumor site, stage, and treatment), epidemiologic variables (smoking pack-years), and genetic variables (12 chromosomal SNPs and 1 mtSNP identified in this study; Fig. 2). The AUC increased from 0.61 (clinical variables only) to 0.64 (clinical-smoking variables) and to 0.84 (clinical, smoking, and genetic variables). The observed difference in AUC between the third and second models was 0.20, and the bias-corrected 95% CIs based on 10,000 bootstrap samples were 0.15 to 0.27, suggesting significant differences between these two models.
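The comparison of two models by AUC can be sketched in Python as follows. The AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen control, and the interval for the AUC difference uses a plain percentile bootstrap, which is simpler than the bias-corrected interval reported above. The function names, the toy scores, and the number of resamples are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of case-control pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_diff(scores_a, scores_b, labels, n_boot=2000, seed=0):
    """Percentile 95% CI for AUC(model b) - AUC(model a)."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # a resample needs both cases and controls
            a = auc([scores_a[i] for i in idx], y)
            b = auc([scores_b[i] for i in idx], y)
            diffs.append(b - a)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Toy predicted risks from two hypothetical models on the same ten subjects.
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
toy_model_a = [0.2, 0.5, 0.4, 0.6, 0.3, 0.4, 0.7, 0.3, 0.6, 0.5]
toy_model_b = [0.1, 0.3, 0.2, 0.4, 0.2, 0.6, 0.9, 0.5, 0.8, 0.7]
ci_low, ci_high = bootstrap_auc_diff(toy_model_a, toy_model_b, toy_labels, n_boot=500)
```

Resampling subjects and scoring both models on the same resample keeps the comparison paired, which is what makes an interval excluding zero informative about the added value of the genetic variables.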

Fig. 2.

Receiver operating characteristic curves from various models showing improvement of discrimination ability.


Because age, gender, and ethnicity were matched by study design, the above models may be weak in terms of epidemiologic risk factors. We therefore analyzed the entire cohort data to explore the main effects of age, gender, and ethnicity on SPT/recurrence and constructed a receiver operating characteristic curve based on these data. We found a significant effect of age on SPT/recurrence, but neither sex nor ethnicity was significantly associated with SPT/recurrence. However, adding age to the clinical-smoking model did not significantly change its AUC (data not shown).

In this large-scale systematic evaluation of 9,645 SNPs in 998 cancer-related genes, we identified six chromosomal SNPs and seven mtSNPs significantly associated with risk of SPT/recurrence after adjustment for multiple comparisons, with evidence of a significant gene-dosage effect. These results support the notion that SPT and tumor recurrence are polygenic traits determined by multiple low-penetrance loci.

We developed a customized SNP chip encompassing well-established pathways through comprehensive and exhaustive database interrogation and literature review. The associations identified are biologically plausible. Among the six significant chromosomal variants, the most significant is localized in the MKI67 gene, an important cell cycle proliferation marker whose expression is correlated with the development and progression of various malignancies, including HNSCC (21). Cyclin-dependent kinase (CDK) 6 mostly functions in the progression of G1 phase through interacting with multiple cyclins and inhibiting tumor suppressor protein RB (22). Both CDK6 and MKI67 are reported to promote HNSCC progression through enhancing expression of protein kinases to phosphorylate and activate proliferative transcription factors (23). MNAT1 is a key component of the protein complex CDK-activating kinase, which phosphorylates CDKs to activate cell cycle progression and also interacts with transcription factor TFIIH to stimulate nucleotide excision repair (24). NHEJ1 gene product interacts with both XRCC4 and LIG4 as a core component of the protein complex responsible for nonhomologous end-joining pathway of dsDNA break repair (25). Suboptimal DNA repair capacity has been shown to increase the risk of HNSCC and SPT/recurrence (10, 11). TNFRSF10B encodes a member of the tumor necrosis factor (TNF) receptor superfamily involved in extrinsic apoptosis pathway (26). Mutations in TNFRSF10B have been identified in multiple cancers, including HNSCC (10). GSTM4 belongs to the Mu subclass of the glutathione S-transferase family, essential in the detoxification of electrophilic compounds, and polymorphisms of this gene family have been extensively associated with the risk and outcomes of HNSCC (27, 28). Taken together, there is strong biological plausibility for the associations between the six identified chromosomal genes and HNSCC.

We also identified several mtSNPs as predictors of HNSCC SPT/recurrence. Mitochondrial dysfunction may lead to tumorigenesis through apoptotic regulation, reactive oxygen species generation, metabolic regulation, and nucleus-mitochondria communications (29). Altered mitochondrial function with increased aerobic glycolysis, the Warburg effect, is a common feature in many tumors (30). Aberrations of mitochondrial DNA have been observed in almost all types of solid cancers, including HNSCC (31). Polymorphisms in the mitochondrial genome have also been associated with many common diseases, including diabetes and cancer (32). The most significant mtSNP, mitoA11813G, is located in the ND4 gene, which has been implicated in head and neck cancer by multiple independent studies (33, 34). Mutations of cytochrome b (CYB) and 16s rRNA (RNR2) were also identified in HNSCC (31). mtSNPs may be involved in the initiation and progression of both index tumors and SPT/recurrence due to possible disruptive effects on mitochondria genes and energy metabolism (35) or related to the central role of mitochondria in apoptosis and reactive oxygen species production.

We further used Ingenuity Pathway Analysis to explore whether certain canonical pathways were overrepresented among the significant associations by inputting chromosomal genes containing SNPs with P < 0.01 (a total of 170 genes; ref. 36). The top predefined canonical pathways to which these genes belong include aryl hydrocarbon receptor signaling, PTEN signaling, lipopolysaccharide/interleukin-1–mediated inhibition of retinoid X receptor function, xenobiotic metabolism signaling, and cell cycle (Supplementary Table S6), most of which are implicated in carcinogen or drug metabolism and treatment-related cellular response. Given the etiologic role of tobacco and alcohol in HNSCC carcinogenesis, these results are not surprising. Most genetic markers of clinical outcome have only modest effects, and predictive power is likely to be enhanced when SNPs are analyzed jointly (18, 37, 38), as we noted. Another data-mining tool we explored was survival tree analysis, which uses binary recursive partitioning to produce a tree structure with many binary splits. Our survival tree analysis produced a decision tree with 14 terminal nodes, each with a different SPT/recurrence risk based on a distinct combination of genotypes (Supplementary Fig. S1). The terminal nodes of the final tree were grouped into four risk groups based on the percentage of patients developing SPT/recurrence in each terminal node: low risk (<25%), medium low risk (25-50%), medium high risk (51-75%), and high risk (>75%). Compared with the low-risk group, the risk increased from 3.48-fold for the medium low-risk group to 17.04-fold for the high-risk group (Supplementary Fig. S1). We validated the risk groups by bootstrapping the samples 10,000 times. These data support an important role of gene-gene interactions in modulating SPT/recurrence. Furthermore, when we incorporated the genetic variables into a multivariate model, we obtained a significant improvement in discriminatory ability (Fig. 2), underscoring the importance of incorporating germ-line genetic variation data with clinical and risk factor data into prediction models for clinical outcomes.

There are also a few limitations of this study. First, the sample size is limited due to the rarity of events and availability of germ-line DNA. We calculated statistical power based on the MAF and genetic models (Supplementary Table S7). Power is adequate for additive and dominant models to detect an OR of ≥2.5 when MAF is >0.05. At a MAF of 0.05, we have more than 91% power and 94% power to detect an increased OR of 2.5 in dominant and additive models, respectively. The power to detect OR of 2.5 is close to 100% for larger MAFs. For a recessive model, we have >80% power to detect an increased OR of 3.0 when MAF is ≥0.20. However, power is limited when MAF is lower in recessive model. We calculated power to detect ORs instead of HRs. In cohort studies with long follow-up time, the HR approach based on survival analysis for time to event end point is even more efficient than the OR approach based on logistic regression for binary end point. Second, due to the sample size, we could not do stratified analyses, for example, on smoking and tumor site. Hence, we adjusted these variables in all our analyses. We also do not have information on human papillomavirus-16 status. Third, due to the difficulty in identifying an external validation population, we are unable to validate the significant SNPs in an independent population. Such external validation would be a critical next step. Finally, we used a nested 1:2 case-control study design, which may not reflect the population of early-stage HNSCC, although the 1:2 case-control ratio is comparable with the roughly 30% of SPT/recurrence incidence in the original population.

There are many strengths of this study. This is the first large-scale study to systematically evaluate germ-line genetic variants in HNSCC SPT/recurrence. Because a genome-wide scanning approach was not possible due to the limited numbers of HNSCC patients who developed SPT/recurrence, our pathway-based custom SNP array is the best option. There is minimal selection bias because the cases and controls were well matched and were all early-stage HNSCC patients enrolled in a prospectively conducted randomized chemoprevention trial. The significant SNPs identified may be useful for clinicians in assessing the risk for SPT/recurrence in early-stage HNSCC patients. The genotyping technology is robust and consistent. Obtaining DNA from peripheral blood is noninvasive and inexpensive. We can generate thousands of genotypes from one drop of blood and get the patients' genetic profile predictive of SPT/recurrence, which can be incorporated into a risk prediction model to identify high-risk patients to undergo intensive screening, smoking cessation, or dietary modification. Chemoprevention trials have been mostly negative in head and neck cancer. Although the main reason for these negative results probably is that the tested chemoprevention agents are not the best, we also think that patients are heterogeneous and these agents may not work in all patients. Not considering patients' genetic background in patient stratification may at least partially contribute to the negative results. Patients with a specific genetic background may respond better to certain chemoprevention agents.

The present study focused on comprehensive risk-modeling analyses of SNPs to identify early-stage HNSCC patients at the highest risk of SPT/recurrence and was conducted within a large-scale randomized trial of 13-cRA. Ongoing work beyond the scope of this article is examining pharmacogenetic interactions to determine whether certain germ-line alterations are associated with a better outcome of 13-cRA treatment. Treatment was included as a covariate in the risk-modeling analysis. We identified the top 20 chromosomal SNPs associated with a high risk of SPT/recurrence (Table 1); of these 20 SNPs, only 1, which is in MKI67, a cell cycle gene, was associated with a retinoid effect of significantly reduced SPT/recurrence risk (62%), making this SNP both highly prognostic and predictive (data not shown). This preliminary observation is encouraging in that the SNP seems to mark both the high-risk patients in greatest need and their sensitivity to an agent; it is being examined further in the broader pharmacogenomic studies mentioned above. If these studies identify a predictive marker or signature based on individual patients' germ-line genetic variations, we can design a better patient stratification plan for future chemoprevention trials, targeting chemoprevention agents to patients who have a high risk of SPT/recurrence and are more likely to benefit from treatment. Such personalized chemoprevention may lead to better success in chemoprevention trials.

No potential conflicts of interest were disclosed.

1. Khuri FR, Kim ES, Lee JJ, et al. The impact of smoking status, disease stage, and index tumor site on second primary tumor incidence and tumor recurrence in the head and neck retinoid chemoprevention trial. Cancer Epidemiol Biomarkers Prev 2001;10:823–9.
2. Khuri FR, Lee JJ, Lippman SM, et al. Randomized phase III trial of low-dose isotretinoin for prevention of second primary tumors in stage I and II head and neck cancer patients. J Natl Cancer Inst 2006;98:441–50.
3. Perez-Ordonez B, Beauchemin M, Jordan RC. Molecular biology of squamous cell carcinoma of the head and neck. J Clin Pathol 2006;59:445–53.
4. Bedi GC, Westra WH, Gabrielson E, Koch W, Sidransky D. Multiple head and neck tumors: evidence for a common clonal origin. Cancer Res 1996;56:2484–7.
5. Mao L, Hong WK, Papadimitrakopoulou VA. Focus on head and neck cancer. Cancer Cell 2004;5:311–6.
6. Maestro R, Gasparotto D, Vukosavljevic T, Barzan L, Sulfaro S, Boiocchi M. Three discrete regions of deletion at 3p in head and neck cancers. Cancer Res 1993;53:5775–9.
7. Bockmuhl U, Wolf G, Schmidt S, et al. Genomic alterations associated with malignancy in head and neck cancer. Head Neck 1998;20:145–51.
8. Izzo JG, Papadimitrakopoulou VA, Li XQ, et al. Dysregulated cyclin D1 expression early in head and neck tumorigenesis: in vivo evidence for an association with subsequent gene amplification. Oncogene 1998;17:2313–22.
9. El-Naggar AK, Lai S, Clayman G, et al. Methylation, a major mechanism of p16/CDKN2 gene inactivation in head and neck squamous carcinoma. Am J Pathol 1997;151:1767–74.
10. Pai SI, Wu GS, Ozoren N, et al. Rare loss-of-function mutation of a death receptor gene in head and neck cancer. Cancer Res 1998;58:3513–8.
11. Spitz MR, Lippman SM, Jiang H, et al. Mutagen sensitivity as a predictor of tumor recurrence in patients with cancer of the upper aerodigestive tract. J Natl Cancer Inst 1998;90:243–5.
12. Sturgis EM, Castillo EJ, Li L, et al. Polymorphisms of DNA repair gene XRCC1 in squamous cell carcinoma of the head and neck. Carcinogenesis 1999;20:2125–9.
13. Cheng L, Sturgis EM, Eicher SA, Char D, Spitz MR, Wei Q. Glutathione-S-transferase polymorphisms and risk of squamous-cell carcinoma of the head and neck. Int J Cancer 1999;84:220–4.
14. Leibovici D, Grossman HB, Dinney CP, et al. Polymorphisms in inflammation genes and bladder cancer: from initiation to recurrence, progression, and survival. J Clin Oncol 2005;23:5746–56.
15. Wu X, Gu J, Dong Q, et al. Joint effect of mutagen sensitivity and insulin-like growth factors in predicting the risk of developing secondary primary tumors and tumor recurrence in patients with head and neck cancer. Clin Cancer Res 2006;12:7194–201.
16. Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet 2004;74:106–20.
17. Steemers FJ, Chang W, Lee G, Barker DL, Shen R, Gunderson KL. Whole-genome genotyping with the single-base extension assay. Nat Methods 2006;3:31–3.
18. Zheng SL, Sun J, Wiklund F, et al. Cumulative association of five genetic variants with prostate cancer. N Engl J Med 2008.
19. Saxena R, de Bakker PI, Singer K, et al. Comprehensive association testing of common mitochondrial DNA variation in metabolic disease. Am J Hum Genet 2006;79:54–61.
20. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 2003;100:9440–5.
21. Slootweg PJ, Koole R, Hordijk GJ. The presence of p53 protein in relation to Ki-67 as cellular proliferation marker in head and neck squamous cell carcinoma and adjacent dysplastic mucosa. Eur J Cancer B Oral Oncol 1994;30B:138–41.
22. Wikman H, Kettunen E. Regulation of the G1/S phase of the cell cycle and alterations in the RB pathway in human lung cancer. Expert Rev Anticancer Ther 2006;6:515–30.
23. Santos CR, Rodriguez-Pinilla M, Vega FM, et al. VRK1 signaling pathway in the context of the proliferation phenotype in head and neck squamous cell carcinoma. Mol Cancer Res 2006;4:177–85.
24. Zhang S, He Q, Peng H, Tedeschi-Blok N, Triche TJ, Wu L. MAT1-modulated cyclin-dependent kinase-activating kinase activity cross-regulates neuroblastoma cell G1 arrest and neurite outgrowth. Cancer Res 2004;64:2977–83.
25. Ahnesorg P, Smith P, Jackson SP. XLF interacts with the XRCC4-DNA ligase IV complex to promote DNA nonhomologous end-joining. Cell 2006;124:301–13.
26. Takeda K, Stagg J, Yagita H, Okumura K, Smyth MJ. Targeting death-inducing receptors in cancer therapy. Oncogene 2007;26:3745–57.
27. Singh M, Shah PP, Singh AP, et al. Association of genetic polymorphisms in glutathione S-transferases and susceptibility to head and neck cancer. Mutat Res 2008;638:184–94.
28. Cabelguenne A, Loriot MA, Stucker I, et al. Glutathione-associated enzymes in head and neck squamous cell carcinoma and response to cisplatin-based neoadjuvant chemotherapy. Int J Cancer 2001;93:725–30.
29. Carew JS, Huang P. Mitochondrial defects in cancer. Mol Cancer 2002;1:9.
30. Warburg O. On the origin of cancer cells. Science 1956;123:309–14.
31. Chatterjee A, Mambo E, Sidransky D. Mitochondrial DNA mutations in human cancer. Oncogene 2006;25:4663–74.
32. Bai RK, Leal SM, Covarrubias D, Liu A, Wong LJ. Mitochondrial genetic background modifies breast cancer risk. Cancer Res 2007;67:4687–94.
33. Fliss MS, Usadel H, Caballero OL, et al. Facile detection of mitochondrial DNA mutations in tumors and bodily fluids. Science 2000;287:2017–9.
34. Allegra E, Garozzo A, Lombardo N, De Clemente M, Carey TE. Mutations and polymorphisms in mitochondrial DNA in head and neck cancer cell lines. Acta Otorhinolaryngol Ital 2006;26:185–90.
35. Wallace DC. Mitochondria and cancer: Warburg addressed. Cold Spring Harb Symp Quant Biol 2005;70:363–74.
36. Calvano SE, Xiao W, Richards DR, et al. A network-based analysis of systemic inflammation in humans. Nature 2005;437:1032–7.
37. Gordon MA, Gil J, Lu B, et al. Genomic profiling associated with recurrence in patients with rectal cancer treated with chemoradiation. Pharmacogenomics 2006;7:67–88.
38. Wu X, Gu J, Wu TT, et al. Genetic variations in radiation and chemotherapy drug action pathways predict clinical outcomes in esophageal cancer. J Clin Oncol 2006;24:3789–98.