Abstract
Purpose: Problems in management of oral cancers or precancers include identification of patients at risk for metastasis, tumor recurrence, and second primary tumors or risk for progression of precancers (dysplasia) to cancer. Thus, the objective of this study was to clarify the role of genomic aberrations in oral cancer progression and metastasis.
Experimental Design: The spectrum of copy number alterations in oral dysplasia and squamous cell carcinomas (SCC) was determined by array comparative genomic hybridization. Associations with clinical characteristics were studied and results confirmed in an independent cohort.
Results: The presence of one or more of the chromosomal aberrations +3q24-qter, -8pter-p23.1, +8q12-q24.2, and +20 distinguishes a major subgroup (70%–80% of lesions, termed 3q8pq20 subtype) from the remainder (20%–30% of lesions, non-3q8pq20). The 3q8pq20 subtype is associated with chromosomal instability and differential methylation in the most chromosomally unstable tumors. The two subtypes differ significantly in clinical outcome with risk for cervical (neck) lymph node metastasis almost exclusively associated with the 3q8pq20 subtype in two independent oral SCC cohorts.
Conclusions: Two subtypes of oral lesions indicative of at least two pathways for oral cancer development were distinguished that differ in chromosomal instability and risk for metastasis, suggesting that +3q, –8p, +8q, and +20 constitute a biomarker with clinical utility for identifying patients at risk for metastasis. Moreover, although increased numbers of genomic alterations can be harbingers of progression to cancer, dysplastic lesions lacking copy number changes cannot be considered benign as they are potential precursors to non-3q8pq20 locally invasive, yet not metastatic, oral SCC. Clin Cancer Res; 17(22); 7024–34. ©2011 AACR.
Translational Relevance
Neck metastasis is the most significant clinical factor responsible for death from oral squamous cell carcinoma. Treatment for oral cancer almost always involves surgical excision of the tumor and cervical (neck) lymph nodes if there is a risk of metastasis. Because current methods for assessing presence of metastasis prior to curative surgery are imprecise and insensitive, many patients are unnecessarily treated for neck metastasis. Analysis of genomic aberrations present in oral precancers and cancers revealed 2 subtypes of these oral lesions that were distinguished by the presence of one or more copy number alterations at 4 genomic regions and indicative of at least 2 pathways for oral cancer development. The subtypes differ significantly in risk for metastasis, suggesting that, if validated in a subsequent larger multicenter trial, these differences in copy number aberrations constitute a biomarker with immediate clinical utility for identifying and planning treatment of patients based on risk for metastasis.
Introduction
Approximately 22,000 new cases of oral cavity cancer are reported each year and more than 95% are oral squamous cell carcinomas (SCC). The 5-year survival rate for patients with oral SCC, at 40%, is among the worst of all sites in the body, with no improvement over the past 40 years, and increasing incidence, particularly among young people and women (1–3). Although tobacco and alcohol are the major etiologic agents, oral cancer also commonly occurs in patients without these risk factors (4). In addition, certain infectious agents have been associated with head and neck cancer, including human papillomavirus (HPV), which is frequently associated with oropharyngeal cancer, but is much less frequent in oral cavity cancers (2%–4% of cases; refs. 5, 6).
Neck metastasis is the most significant clinical factor responsible for death from oral SCC, reducing survival rate by more than one-half (7, 8). Patients with oral cancer are also at risk for tumor recurrence and development of second primaries. Most oral SCCs (approximately 90%) are preceded by clinically evident precancerous oral lesions, including oral epithelial dysplasia, which appear as white (leukoplakia) or red patches (erythroplakia) that are characterized microscopically by varying degrees of dysplasia (from mild to severe; ref. 9). A recent meta-analysis found the transformation rate of dysplasia to oral SCC to be on average 10% for mild/moderate and 24% for severe dysplasia and carcinoma in situ (10).
Understanding the molecular basis of oral SCC progression and tumorigenesis can contribute to development of novel strategies for diagnosis, cancer risk assessment and classification, as well as targeted therapies for prevention and treatment. Genomic analysis is likely to be of particular utility because it is generally accepted that oral SCC develop via accumulation of genetic and epigenetic changes in a multistep process with aberrations being frequently recognized in premalignant lesions or in histologically normal tissue (11). Studies purporting to show that aneuploidy alone was the best predictor of progression from oral dysplasia to cancer, however, were subsequently discovered to have been founded on fabricated data (12). Therefore, to clarify the role of genomic aberrations in oral cancer progression and metastasis, we applied array comparative genomic hybridization (CGH) to determine the genome-wide spectrum of copy number aberrations in oral dysplasia and SCC cohorts. These studies revealed 2 subtypes of oral lesions that were distinguished by the presence of one or more copy number alterations at 4 genomic regions and indicative of at least 2 pathways for oral cancer development. Moreover, the subtypes differ significantly in risk for metastasis, suggesting that these differences in copy number aberrations constitute a biomarker with clinical utility for identifying and planning treatment of patients on the basis of risk for metastasis.
Patients, Materials, and Methods
Patients and tissue samples
We obtained formalin-fixed, paraffin-embedded (FFPE) SCC surgical resection specimens (SCC cohorts #1 and #2) from oral cavity sites (tongue, gingiva, floor of mouth, retromolar trigone, buccal mucosa, and lip) and associated clinical data through the University of California San Francisco Oral Cancer Tissue Bank and Cancer Registry (Table 1 and Supplementary Tables S1 and S2). Patient consent was obtained for use of all specimens. Cohort #1 was published previously (13). Data for 2 additional SCC cohorts were obtained from literature reports and are referred to as SCC cohorts #3 (14) and #4 (15). We also obtained dysplasia biopsy specimens in two groups: one with no known association with cancer (cohort #D1) and one from patients who subsequently developed cancer at the site of the dysplasia or whose dysplasia appeared at the site of a previous cancer (cohort #D2).
Table 1. Summary of clinical characteristics of dysplasia and oral SCC cohorts from UCSF
Characteristic | Dysplasia cohort #D1 (no known association with cancer), n = 29 | Dysplasia cohort #D2 (associated with cancer), n = 10 | SCC cohort #1, n = 89 | SCC cohort #2, n = 63 |
---|---|---|---|---|
Age | ||||
<65 | 20 (69%) | 6 (60%) | 47 (53%) | 30 (48%) |
≥65 | 9 (31%) | 4 (40%) | 42 (47%) | 33 (52%) |
Sex | ||||
Female | 9 (31%) | 4 (40%) | 47 (53%) | 26 (41%) |
Male | 20 (69%) | 6 (60%) | 42 (47%) | 37 (59%) |
Grade | ||||
Mild | 12 (41%) | 3 (30%) | NA | NA |
Moderate | 8 (28%) | 4 (40%) | NA | NA |
Severe | 9 (31%) | 3 (30%) | NA | NA |
Moderately differentiated | NA | NA | 35 (39%) | 42 (67%) |
Moderate to poorly differentiated | NA | NA | 4 (4%) | 3 (5%) |
Moderate to well differentiated | NA | NA | 6 (7%) | 1 (2%) |
Poorly differentiated | NA | NA | 5 (6%) | 4 (6%) |
Well differentiated | NA | NA | 39 (44%) | 13 (21%) |
Site | ||||
Buccal mucosa | 4 (14%) | 0 | 17 (19%) | 9 (14%) |
Floor of mouth | 2 (7%) | 0 | 17 (19%) | 11 (17%) |
Gingiva | 1 (3%) | 1 (10%) | 21 (24%) | 11 (17%) |
Palate | 1 (3%) | 0 | 0 | 2 (3%) |
Tongue | 21 (72%) | 7 (70%) | 34 (38%) | 21 (33%) |
Retromolar trigone | 0 | 1 (10%) | 0 | 5 (8%) |
Lower lip | 0 | 1 (10%) | 0 | 0 |
Floor of mouth, tongue | 0 | 0 | 0 | 2 (3%) |
Floor of mouth, tongue, buccal mucosa | 0 | 0 | 0 | 1 (2%) |
Floor of mouth, tongue, gingiva | 0 | 0 | 0 | 1 (2%) |
TP53 mutation status | ||||
Wild type | 19 (66%) | 3 (30%) | 59 (66%) | NA |
Mutant | 7 (24%) | 2 (20%) | 16 (18%) | NA |
Unknown | 3 (10%) | 5 (50%) | 14 (16%) | NA |
Cancer association | ||||
Previous | Unknown | 4 (40%) | NA | NA |
Subsequent | Unknown | 5 (50%) | NA | NA |
Previous and subsequent | Unknown | 1 (10%) | NA | NA |
Tumor size (cm) | ||||
<2.7 | NA | NA | NA | 29 (46%) |
≥2.7 | NA | NA | NA | 33 (52%) |
Unknown | NA | NA | NA | 1 (2%) |
Tumor thickness (cm) | ||||
<1.3 | NA | NA | NA | 24 (38%) |
≥1.3 | NA | NA | NA | 26 (41%) |
Unknown | NA | NA | NA | 13 (21%) |
Clinical node status | ||||
Negative | NA | NA | NA | 25 (40%) |
Positive | NA | NA | NA | 14 (22%) |
Unknown | NA | NA | NA | 24 (38%) |
Pathologic node status | ||||
N0 | NA | NA | NA | 40 (63%) |
N+ | NA | NA | NA | 23 (37%) |
Recurrence | ||||
Not recurred | NA | NA | NA | 49 (78%) |
Recurred | NA | NA | NA | 12 (19%) |
Unknown | NA | NA | NA | 2 (3%) |
Vital status | ||||
Survived/censored | NA | NA | NA | 25 (40%) |
Dead | NA | NA | NA | 38 (60%) |
Tumor status | ||||
Free | NA | NA | NA | 44 (70%) |
Not free | NA | NA | NA | 15 (24%) |
Unknown | NA | NA | NA | 4 (6%) |
Alcohol use | ||||
Current | NA | NA | NA | 25 (40%) |
Never used | NA | NA | NA | 11 (17%) |
Previous use | NA | NA | NA | 7 (11%) |
Unknown | NA | NA | NA | 20 (32%) |
Tobacco use | ||||
Current cigarette smoker | NA | NA | NA | 19 (30%) |
Never used | NA | NA | NA | 12 (19%) |
Previous use | NA | NA | NA | 14 (22%) |
Current snuff/smokeless tobacco user | NA | NA | NA | 1 (2%) |
Unknown | NA | NA | NA | 17 (27%) |
NOTE: SCC cohort #1 from Snijders and colleagues (13).
Abbreviation: NA, not applicable.
For SCC cohort #2, we considered oral cavity SCC cases treated at the University of California, San Francisco (UCSF) Medical Center between 1998 and 2005 to be eligible for inclusion if patients were older than 21 years and had not received radiation or chemotherapy prior to tumor resection. We considered cases to be node positive (N+) if the histopathologic nodal status was positive at the time of surgical treatment or metastasis was identified during the 5-year follow-up period, whereas we considered patients to be node negative (N0) if pathologic nodal status was negative at the time of surgical resection and no nodal involvement occurred during the 5-year follow-up period. From the 2,500 cases in the bank, we were able to identify and accession tissue blocks for 64 cases for which the required clinical information was available and there was sufficient tumor material (i.e., tumors ≥1.5 cm in diameter) for analysis. We dissected regions of dysplasia or tumor from 15 consecutive 10 μm FFPE tissue sections from routine surgical excisions. The first and last sections were stained with hematoxylin and eosin and examined to confirm the diagnosis and grading of dysplasia, which was done by one pathologist (RCKJ), and to estimate the normal cell content of the regions of dysplasia and SCC selected for dissection; the selected regions contained 60% to 90% epithelial cells. For the analysis of cohort #2, we also dissected regions of normal tissue (for example, muscle) from the same patient blocks.
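The nodal status rule described above can be written as a small decision function. This is a minimal sketch in Python rather than the R code used for the study; the function name, argument names, and the choice to return `None` when follow-up is missing are assumptions made for illustration and are not taken from the study records.

```python
from typing import Optional


def node_status(pathologic_positive: bool,
                metastasis_in_followup: Optional[bool]) -> Optional[str]:
    """Classify a case as 'N+' or 'N0' following the rule stated in the text.

    pathologic_positive: histopathologic nodal status at surgery.
    metastasis_in_followup: nodal metastasis during the 5-year follow-up,
        or None if follow-up is unavailable (handling of this case is an
        assumption; the text does not describe it).
    """
    if pathologic_positive:
        return "N+"          # positive at surgery is sufficient
    if metastasis_in_followup is None:
        return None          # cannot confirm N0 without follow-up
    return "N+" if metastasis_in_followup else "N0"


print(node_status(False, True))   # negative at surgery, later metastasis
print(node_status(False, False))  # negative throughout follow-up
```

Note that a case is only N0 when both conditions hold, so a pathologically negative case that later develops nodal disease is counted as N+.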
We note that, consistent with reports of a low prevalence of HPV infection in oral SCC (5, 6), testing of a sample of cases from the UCSF Tissue Bank found 0 of 16 positive for oncogenic HPV.
Array CGH
We carried out copy number measurements on arrays of 2,464 bacterial artificial chromosome clones printed in triplicate as described previously (13). Array data analysis followed our published procedures with the exception that preprocessing for SCC cohort #2 took advantage of the availability of paired tumor and normal samples to reduce the noise commonly associated with copy number profiles generated with DNA extracted from archival specimens (Supplementary Methods). The array data sets are available at the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO; GSE28407).
Statistical methods
P < 0.05 was considered significant, unless there was a multiple comparisons adjustment, in which case a Q < 0.05 was considered significant. Calculations were carried out with the R language (16).
Hierarchical clustering of tumor profiles
We grouped our samples and generated heat maps by unsupervised clustering of samples on trichotomous gain/loss/normal data for the autosomes. We used Euclidean distance as the distance metric and Ward's linkage as the agglomeration method.
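As an illustration of this clustering step, the sketch below codes each clone as +1 (gain), 0 (normal), or −1 (loss) and merges samples with Ward's minimum-variance criterion on squared Euclidean distances between cluster centroids. It is written in plain Python for readability and is not the authors' R code; the function name and toy profiles are hypothetical, and the R routine used in the study may treat distances slightly differently.

```python
from itertools import combinations


def ward_clusters(profiles, n_clusters):
    """Agglomerate samples with Ward's criterion until n_clusters remain.

    profiles: equal-length lists coded +1 (gain), 0 (normal), -1 (loss).
    Returns a sorted list of clusters, each a sorted list of sample indices.
    """
    # Each cluster is (member indices, centroid).
    clusters = [([i], [float(v) for v in p]) for i, p in enumerate(profiles)]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            (ia, ca), (ib, cb) = clusters[a], clusters[b]
            na, nb = len(ia), len(ib)
            sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
            # Increase in within-cluster sum of squares if a and b merge.
            cost = na * nb / (na + nb) * sq_dist
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best
        (ia, ca), (ib, cb) = clusters[a], clusters[b]
        na, nb = len(ia), len(ib)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        merged = (sorted(ia + ib), centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(members for members, _ in clusters)


# Toy data: five samples, four clones each (hypothetical values).
toy_profiles = [
    [1, 1, 0, 0],
    [1, 1, 0, -1],
    [0, 0, -1, -1],
    [0, 0, -1, 0],
    [1, 1, 0, 0],
]
print(ward_clusters(toy_profiles, 2))  # [[0, 1, 4], [2, 3]]
```

The exhaustive pair search is adequate for cohorts of this size (tens of samples); a library implementation would be preferable for larger data sets.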
Determination of recurrent regions of aberration
We defined recurrent common regions of aberration as contiguous clones for which the frequency of gain (or loss) occurred at greater than or equal to a specified frequency in a cohort. Within each recurrent region, we also defined recurrent focal regions as any local maxima in the frequency. In a new sample, we considered a previously specified region to be “gained” if more clones were gained than lost, “lost” if more clones were lost than gained, and “normal” if there were no gains or losses. Counts of aberrant regions were compared by the Wilcoxon rank-sum test.
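The two definitions above, recurrent regions in a cohort and the call for a region in a new sample, can be sketched as follows. This is an illustrative Python version, not the study code. It assumes the clones passed in lie on a single chromosome in genome order, and the "ambiguous" label for an equal, nonzero number of gained and lost clones is our own placeholder because the stated rule does not cover that case.

```python
def recurrent_regions(freqs, cutoff):
    """Return (start, end) index pairs of contiguous clones with
    aberration frequency >= cutoff (clones assumed to be in genome order
    on one chromosome)."""
    regions, start = [], None
    for i, f in enumerate(freqs):
        if f >= cutoff and start is None:
            start = i                      # region opens
        elif f < cutoff and start is not None:
            regions.append((start, i - 1))  # region closes
            start = None
    if start is not None:
        regions.append((start, len(freqs) - 1))
    return regions


def call_region(calls):
    """Call a region from per-clone calls coded +1 (gain), 0, -1 (loss)."""
    gained = sum(1 for c in calls if c > 0)
    lost = sum(1 for c in calls if c < 0)
    if gained > lost:
        return "gained"
    if lost > gained:
        return "lost"
    if gained == 0:
        return "normal"
    return "ambiguous"  # tie with aberrations present; not defined in the text


# Hypothetical gain frequencies for six consecutive clones.
print(recurrent_regions([0.1, 0.25, 0.3, 0.1, 0.2, 0.2], 0.2))  # [(1, 2), (4, 5)]
print(call_region([1, 1, 0, -1]))  # gained
```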
To identify samples as 3q8pq20, we defined recurrent common regions with a frequency of more than 20% in the dysplasia cohort with no known association with cancer. We declared samples to be 3q8pq20 if one or more of the common recurrent gains on 3q, 8q, or 20 (encompassing a focal region on 20p including JAG1) or loss of 8p was present. Proportions of 3q8pq20 subjects were compared between cohorts by the Fisher exact test.
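Expressed as code, the subtype assignment is an "any of four" rule. The sketch below is a minimal Python illustration; the dictionary layout and function name are hypothetical, and the region labels are those listed in Table 2.

```python
# Signature aberrations: region label -> state that counts toward 3q8pq20.
SIGNATURE = {
    "3q24-qter": "gained",
    "8pter-p23.1": "lost",
    "8q12-q24.2": "gained",
    "20pter-qter": "gained",
}


def is_3q8pq20(region_calls):
    """Return True if at least one signature aberration is present.

    region_calls: mapping of region label to 'gained', 'lost', or 'normal'.
    Regions missing from the mapping are treated as not aberrant
    (an assumption made for this sketch).
    """
    return any(region_calls.get(region) == state
               for region, state in SIGNATURE.items())


print(is_3q8pq20({"8pter-p23.1": "lost", "3q24-qter": "normal"}))  # True
print(is_3q8pq20({"8pter-p23.1": "gained"}))                       # False
```

Only the direction listed for each region counts, so a gain of 8p or a loss of 3q does not assign a sample to the 3q8pq20 subtype.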
Evaluation of significant differences in recurrent aberrations in dysplasia and SCC
We compared dysplasias and SCC cohort #1 for differences in aberrations of chromosome arms or recurrent regions of aberration. For the regionwise comparison, we used a frequency cutoff of 20% in SCC cohort #2. Differences were evaluated by the Fisher exact test (17) using the dichotomized indicator gained (or lost)/not gained (or not lost), and the P values were adjusted for multiple testing by controlling the false discovery rate (18).
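To make the testing procedure concrete, the sketch below computes a two-sided Fisher exact P value for a 2 × 2 table and then adjusts a set of P values for the false discovery rate. It is a plain-Python illustration, not the R code used in the study. The adjustment shown is the Benjamini-Hochberg step-up procedure, which is an assumption on our part because this section cites the method only by reference number; the example counts are invented.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]].

    Rows are cohorts; columns are aberrant / not aberrant.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x aberrant cases in row 1.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    observed = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # Sum the tables that are no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= observed * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return FDR-adjusted values (Q values) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # largest P value first
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented counts: 3 of 4 aberrant in one group, 1 of 4 in the other.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A region would then be reported as significant when its adjusted value falls below the Q < 0.05 threshold given under Statistical methods.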
Evaluation of differences between 3q8pq20 and non-3q8pq20 tumors in SCC cohorts #1 and #2
Similar to the above analysis for regional differences, we identified differences in aberration frequencies in individual clones between 3q8pq20 and non-3q8pq20 cases in SCC cohorts #1 and #2 by the Fisher exact test. Differences in instability characteristics in SCC cohort #2 were evaluated by the Wilcoxon rank-sum test.
Associations with clinical characteristics
We compared patient and tumor characteristics with 3q8pq20 status, cervical node status, and genome instability measures by the Fisher exact test. We estimated survival curves by nodal status with the Kaplan–Meier method, and we tested for differential survival with the log-rank test.
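For readers unfamiliar with the Kaplan–Meier estimator, the following sketch shows the product-limit calculation on a toy data set. It is an illustrative Python version with hypothetical follow-up times, not the R code used for the analysis, and it does not include the log-rank comparison between groups.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times: follow-up durations (any consistent unit).
    events: 1 if death was observed at that time, 0 if censored.
    """
    at_risk = len(times)
    survival, curve = 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up (years) for five patients; 0 marks censoring.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their last follow-up but never lower the survival estimate themselves, which is why the curve only steps down at observed deaths.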
Results
Copy number aberrations distinguish two oral dysplasia and cancer subtypes
We assembled a cohort of 39 oral dysplasia cases composed of lesional biopsies from 29 cases with no known association with cancer and 10 from patients who subsequently developed cancer at the site of the dysplasia or whose dysplasia appeared at the site of a previous cancer (cohorts #D1 and #D2, respectively; Table 1 and Supplementary Table S1). We compared these profiles to those of oral SCCs from 2 independent cohorts: cohort #1 (89 cases), which we had previously profiled (13), and cohort #2, with 63 cases with 5-year clinical follow-up (Table 1 and Supplementary Table S2).
Considering the dysplasia cases with no known association with cancer (cohort #D1), we found 4 regions of low level aberration (e.g., single copy gain and loss) that were each present in more than 20% of cases (Fig. 1A), including gains at 3q24-qter, 8q12-q24.2, and chromosome 20, and loss at 8pter-p23.1 (Table 2). The majority of the dysplasia cases (79%) harbored one or more of these recurrent aberrations, suggesting that these cases form one group, the 3q8pq20 subgroup, whereas the remaining 21% of cases, which lack +3q, –8p, +8q, and +20, form the non-3q8pq20 subgroup. Dysplasia grade and TP53 mutation status were not associated with subgroup membership (Fig. 1C). Furthermore, analysis of a very limited number of dysplasia cases (n = 10) that progressed to cancer or arose at the site of a previously treated cancer (cohort #D2) revealed that 3q8pq20 and non-3q8pq20 subtypes were present in proportions similar to those in the dysplasia cohort with no known association with cancer, 70% and 30%, respectively (Supplementary Fig. S1).
Figure 1. Copy number aberrations involving 3q, 8p, 8q, and chromosome 20 are frequent in oral dysplasia and occur at similar frequency in oral SCC. A and B, frequency of copy number aberrations shown in genome order for each clone on the array in 29 oral dysplasia samples with no known association with cancer (A) and oral SCC cohort #1 (B). We indicate gains by red bars ranging in frequency from 0 to 0.6 and losses by blue bars ranging from 0 to −0.6. Chromosome boundaries are indicated by vertical lines. The frequency scales are truncated at 0.6 for both gains and losses because no frequency exceeded this bound. C and D, hierarchical clustering based on DNA copy number profiles of 29 oral dysplasia samples with no known association with cancer (C) and oral SCC cohort #1 (D). We represented the individual clones on the array as rows, ordered by chromosome and genome position. The vertical band on the left side of the heat map indicates the genome positions of the clones on chromosomes. Clones on the p-arm are indicated either in light blue or yellow and clones on the q-arm in dark blue or green. We show acrocentric chromosomes in green or dark blue. Columns represent individual samples. A gain or a loss at a particular locus for a particular sample is indicated by red and blue, respectively, and focal amplifications by yellow dots. The bands across the top of the heat map indicate characteristics of each case: dysplasia grade (mild, light blue; moderate, dark blue; severe, purple), TP53 mutation status (TP53 mutant, dark blue; no detected mutation, light blue; TP53 status unknown, white), and the 3q8pq20 status (3q8pq20, dark blue; non-3q8pq20, light blue). E, frequencies of gains of 3q, 8q, 20 and loss of 8p in oral dysplasia and SCC normalized to the total number of aberrations at these loci in each cohort. F, frequency of 3q8pq20 and non-3q8pq20 cases in oral dysplasia and SCC.
Table 2. Recurrent regions of aberration at greater than 20% frequency in dysplasia with no known association with cancer (cohort #D1)
Region | Aberration | Start bp | End bp | Proximal clone | Proximal marker | Distal clone | Distal marker | Maximum frequency |
---|---|---|---|---|---|---|---|---|
3q24-qter | Gain | 145,059,218 | 198,022,429 | RP11-72E23 | AFM210VE7 | GS1-56H22 | | 0.41 |
8pter-p23.1 | Loss | 1 | 11,035,922 | GS1-77L23 | | RP11-252K12 | SHGC-1962 | 0.34 |
8q12-q24.2 | Gain | 61,101,900 | 134,150,084 | RP11-258B14 | SHGC-32354 | RP11-184M21 | SHGC-1948 | 0.52 |
20pter-qter | Gain | 1 | 63,025,519 | RP1-82O2 | | RP1-81F12 | | 0.28 |
NOTE: Clone or sequence-tagged site marker positions according to February 2009 (hg19) assembly.
We noted that gains of 3q, 8q, and 20 and loss of 8p were frequent aberrations in both oral SCC cohorts (Fig. 1B and D, Supplementary Fig. S2), and the frequency did not differ from that in the dysplasia cases (Fig. 1E, Supplementary Table S3). Moreover, the frequency of tumors harboring one or more of these aberrations was not significantly different from the frequency in dysplasia cohort #D1 (Fig. 1F, 67% and 76%, P = 0.25 and 0.8 for SCC cohorts #1 and #2, respectively), suggesting that not only dysplasias but also oral SCCs can be assigned to 3q8pq20 and non-3q8pq20 subtypes.
To confirm that the frequencies of the 2 subtypes were not simply a characteristic of oral cancers from Northern California, we accessioned an independent oral SCC array CGH data set from the Netherlands (14) composed of 29 cases (SCC cohort #3). We did not find a significant difference in the proportion of 3q8pq20 and non-3q8pq20 subtypes (75% and 25%, respectively; P = 0.76) among the 28 cases with copy number data of sufficient quality. Moreover, because these 28 cases had tested negative for HPV, these observations allow us to rule out HPV infection as an underlying determinant of subtype. Thus, 3q8pq20 and non-3q8pq20 subtypes and their relative proportions appear to be a universal feature of oral SCC cases from western countries.
Although dysplasia and oral SCCs share recurrent aberrations involving 3q, 8p, 8q, and chromosome 20, it is clear from Fig. 1 that copy number aberrations are more frequent in oral SCCs. For example, in the 89 SCCs of cohort #1, 11 aberrant loci occurred in 15% or more of cases, including a loss at 8p12 that maps proximal to the region of loss at 8p shared by dysplasia and SCCs (Fig. 1B, Supplementary Table S4). Therefore, to identify copy number alterations that might distinguish precancers and cancers, we first defined recurrent gains and losses as those occurring at more than 20% frequency in SCC cohort #2, and then compared the frequency of recurrent aberrations in all 39 dysplasia cases (cohorts #D1 and #D2) to those in the independent SCC cohort #1. This analysis found only the region +7pter-p11.2 (Q = 0.036) to be significantly more frequent in cancers (Supplementary Table S5), suggesting that upregulation of gene(s) in this region may occur late in progression to cancer.
Copy number aberrations are more frequent in the 3q8pq20 subtype
Hierarchical clustering of the cases in the 2 oral SCC cohorts revealed that recurrent low level gains and losses were not uniformly distributed (Fig. 1D and Supplementary Fig. S2). Indeed, we observed that recurrent aberrations were more frequent in the 3q8pq20 subtype, which also further subdivides into high and low instability tumors (Fig. 2, Supplementary Figs. S2 and S3). In addition, we observed a highly significant association of 3q8pq20 subtype with various types of chromosomal level genome instability (Supplementary Fig. S4), including, for example, differences in the fraction of the genome gained (FGG; P < 10⁻⁹), lost (P < 10⁻⁶), and altered (P < 10⁻⁸). On the other hand, although we more frequently observed mutations in exons 5 to 8 of TP53 (often associated with higher levels of genome instability) in the 3q8pq20 group of cohort #1 than in the non-3q8pq20 group, the difference was not significant (Fisher exact test, P = 0.12).
Figure 2. Distribution of low level gains and losses among 3q8pq20 and non-3q8pq20 oral SCC cases. Hierarchical clustering based on DNA copy number profiles of non-3q8pq20 (left) and 3q8pq20 (right) cases in SCC cohorts #1 (A) and #2 (B) as in Fig. 1C and D. The enhanced genomic instability associated with the 3q8pq20 subtype results in recurrent aberrations being more frequent in 3q8pq20 tumors (e.g., mean number of recurrent aberrations occurring at 15% or greater frequency in cohort #1 = 4.53, range 1 to 13 compared with non-3q8pq20 tumors with mean = 0.79, range 0 to 7). The bands across the top of the heat map show TP53 status as in Fig. 1 or nodal status (N+, dark blue and N0, light blue).
The 3q8pq20 tumors with high levels of chromosomal instability are differentially methylated
The lack of chromosome level instability in non-3q8pq20 tumors suggests that development of these tumors could be associated with other, copy number neutral, mechanisms, such as microsatellite instability or epigenetic alterations. Microsatellite instability is not common in oral cancer (19), whereas genome-wide alterations in methylation patterns are observed (15). Therefore, to investigate whether 3q8pq20 and non-3q8pq20 oral SCC subtypes differed in methylation patterns, we accessioned a published data set for a head and neck cancer patient cohort (SCC cohort #4) composed of 15 oral cavity and 4 oropharyngeal tumors (15) for which both copy number and methylation measurements were available (NCBI GEO accession numbers GSE20939 and GSE20742, respectively). We assigned 3q8pq20 status to the oral cavity cases (Supplementary Table S6). Hierarchical clustering using the top 10% most variable methylation probes (142 probes, Supplementary Table S7) revealed that differential methylation was associated with the cases with the greater number of copy number alterations (highly unstable 3q8pq20), as noted previously (15). The highly unstable 3q8pq20 cases clustered separately from the low genomic instability 3q8pq20, non-3q8pq20, and normal samples (Supplementary Fig. S5). The normal control cases also clustered together, whereas the non-3q8pq20 and low instability 3q8pq20 cases were somewhat intermixed, suggesting that extensive epigenetic alterations do not contribute to formation of non-3q8pq20 tumors.
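The probe-selection step that precedes this clustering can be sketched as below. This is a hypothetical Python illustration, not the procedure used for the published analysis: the text does not state which variability measure was used, so across-sample variance is an assumption, as are the function name and the toy probe identifiers.

```python
from statistics import pvariance


def most_variable_probes(values_by_probe, fraction=0.10):
    """Return the probes with the highest across-sample variance.

    values_by_probe: mapping of probe ID to a list of methylation values,
        one per sample.
    fraction: share of probes to keep (0.10 for the top 10%).
    """
    ranked = sorted(values_by_probe,
                    key=lambda probe: pvariance(values_by_probe[probe]),
                    reverse=True)
    keep = max(1, round(len(ranked) * fraction))  # always keep at least one
    return ranked[:keep]


# Ten invented probes; only "probe_3" varies widely between samples.
toy_methylation = {f"probe_{i}": [0.5, 0.5, 0.5, 0.5] for i in range(10)}
toy_methylation["probe_3"] = [0.1, 0.9, 0.2, 0.8]
print(most_variable_probes(toy_methylation))  # ['probe_3']
```

The retained probes would then be clustered in the same way as the copy number profiles.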
Gene amplification occurs in dysplasia
In addition to the low level gains and losses discussed above, we observed that dysplasia genomes harbored amplifications, defined as focal regions of higher level increased copy number. Previously, we reported that oral SCCs characteristically amplify narrow regions of the genome (<3 Mb) and identified 18 such recurrent amplicons (13). In the 29 dysplasia cases with no known association with cancer, we found 2 of these amplicons at 11q13 (CCND1 and PAK1) and 20p12.2 (JAG1) to be present, as well as amplification at 2q11.2 in 2 dysplasia cases and 2 nonrecurrent amplicons at 20q13.33 and 21q21.3 (Fig. 1B, Table 3). The amplification at 21q21.3, however, spans a region that is gained in 15% or more of SCC cases (Supplementary Table S4) and a likely driver gene for this amplicon is MIR155. Although the 2q11.2 amplicon had not been observed previously in the 89 oral SCCs (13), we had reported it in an oral SCC cell line (20) and it has recently been reported by others in dysplasia (21). The recurrent amplicons are present in both 3q8pq20 and non-3q8pq20 dysplasia and SCC genomes (Figs. 1 and 2), and thus their formation seems to be mediated by processes independent of those driving low level gains and losses.
Table 3. Amplicons in 29 dysplasia samples from patients with no known history of oral cancer (cohort #D1)
Dysplasia case no. | Cyto-band | Size (Mb) | SCCa (%) | Proximal flanking clone | STS | Distal flanking clone | STS | Candidate oncogene(s) |
---|---|---|---|---|---|---|---|---|
5779; 5914 | 2q11.2 | 3.7 | 0b | RP11-327M19 | | RP11-629A22 | AFMB355ZG1 | CIAO1, CNNM3 |
5952 | 11q13.3 | 1.6 | 11 | CTD-2080I19 | RH7839 | RP11-120P20 | SHGC-4518 | CCND1, EMS1 |
5952 | 11q13.5 | 0.9 | 2 | CTC-352E23 | RH52308 | RP11-98G24 | SHGC-31540 | PAK1 |
6390 | 20p12.2 | 1.2 | 3 | RMC20P160 | WI-7829 | RMC20P178 | D20S186 | JAG1 |
6390 | 20q13.33 | 3.2 | 0 | RP11-94A18 | AFM218XE7 | RP11-358D14 | X70940 | CDH4, PSMA7, ADRM1, LAMA5, NTSR1, BIRC7 |
5779 | 21q21.3 | 4.8 | 0c | RP11-86J21 | AFMA081WF1 | RP11-115H17 | SHGC-11277 | MIR155 |
aFrequency reported in oral SCC cohort #1 by Snijders and colleagues (13).
bAlthough the 2q11.2 amplicon had not been observed previously in SCC cohort #1 as a recurrent amplicon (13), we had reported it in an oral SCC cell line (20) and it has recently been reported by others in dysplasia (21).
cThe region is gained in 15% or more of SCC cases.
Abbreviation: STS, sequence-tagged site.
Oral cancer subtypes differ in clinical behavior
Considered together, the distribution of copy number aberrations in dysplasia and SCC suggests that there are 2 distinct routes to oral cancer, one associated with greater genome instability and acquisition of +3q, –8p, +8q, and/or +20 in premalignant stages and the other lacking chromosomal level instability detectable by CGH. Potential differences in developmental pathways leading to oral cancer are likely to impact clinical behavior. Indeed, we observed a highly significant association of 3q8pq20 status with pathologic cervical (neck) lymph node status (OR = 11.5, CI = 1.5–521.8; Fisher exact test, P = 0.006); that is, neck metastasis (N+) was present in 46% (22 of 48) of 3q8pq20 tumors and in only 7% (1 of 15) of non-3q8pq20 tumors (Table 4 and Supplementary Table S10).
Table 4. Biomarker prediction of pathologic cervical node status in two independent oral SCC cohorts
 | Cohort #2 (n = 63) | | Cohort #3 (n = 16) | |
---|---|---|---|---|
Nodal status | N0 | N+ | N0 | N+ |
3q8pq20 | 26 | 22 | 3 | 10 |
non-3q8pq20 | 14 | 1 | 3 | 0 |
Sensitivity | 0.96 | | 1 | |
Specificity | 0.35 | | 0.5 | |
Positive predictive value | 0.46 | | 0.77 | |
Negative predictive value (NPV) | 0.93 | | 1 | |
P (Fisher exact test) | 0.0058 | | 0.036 | |
Sample OR | 11.85 (CI: 1.52–521.82) | | | |
The presence of metastases to the cervical lymph nodes is a major determinant of survival for oral SCC patients (7, 8). The differential risk for metastasis in the 3q8pq20 and non-3q8pq20 oral SCC subtypes indicates that chromosomal aberrations +3q, –8p, +8q, and +20 provide a potential biomarker to identify patients with no or low risk of metastasis. To confirm this observation, we investigated the association of nodal status and 3q8pq20 status in the independent oral SCC cohort #3 from the Netherlands (14), for which copy number and pathologic node status were available (Table 4). In cohort #3, we also found the non-3q8pq20 subtype to be at low risk for metastasis (Fisher exact test, P = 0.036). We note in particular that the negative predictive value (NPV) for metastasis (i.e., ability to predict N0 cases at the time of biopsy) was 93% in SCC cohort #2 and 100% in the Dutch cohort (Table 4). By contrast, the NPVs for other clinical characteristics associated with metastasis were 75% and 80% for tumor size (<2.0 cm) and thickness (≤5 mm), respectively (Supplementary Table S11). We also observed a modest association with age in cohort #2, in which non-3q8pq20 tumors were more frequent in patients older than 65 years (P = 0.018, Supplementary Table S10), but not in the Dutch cohort.
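The predictive values and P values reported for the two cohorts can be recomputed from the 2 × 2 counts of 3q8pq20 status against pathologic nodal status. The following Python is a minimal sketch of that arithmetic, not the analysis code used in the study: the function names `diagnostic_metrics` and `fisher_exact_two_sided` are illustrative, 3q8pq20 is treated as "test positive" and N+ as "condition positive", and the two-sided Fisher P value is assumed to be the sum of the probabilities of all tables no more likely than the observed one.

```python
from math import comb

def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV, and sample odds ratio for a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Cross-product (sample) odds ratio; infinite when an off-diagonal cell is zero.
        "odds_ratio": (tp * tn) / (fp * fn) if fp and fn else float("inf"),
    }

def fisher_exact_two_sided(tp, fp, fn, tn):
    """Two-sided Fisher exact P value from the hypergeometric distribution."""
    row1 = tp + fp                 # 3q8pq20 tumors
    col1 = tp + fn                 # node-positive (N+) tumors
    n = tp + fp + fn + tn

    def prob(k):
        # Probability that k of the row1 tumors are N+ with all margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(tp)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Cohort #2: 3q8pq20 -> 26 N0, 22 N+; non-3q8pq20 -> 14 N0, 1 N+
cohort2 = diagnostic_metrics(tp=22, fp=26, fn=1, tn=14)
print({name: round(value, 2) for name, value in cohort2.items()})
print(round(fisher_exact_two_sided(22, 26, 1, 14), 4))

# Cohort #3: 3q8pq20 -> 3 N0, 10 N+; non-3q8pq20 -> 3 N0, 0 N+
print(round(fisher_exact_two_sided(10, 3, 0, 3), 3))
```

Under these assumptions the rounded values agree with those listed for cohorts #2 and #3 in Table 4. The sketch does not compute a confidence interval for the odds ratio.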
Because the 3q8pq20 and non-3q8pq20 subtypes also differ in genomic instability, we considered association of genome instability measures with clinical characteristics in cohort #2. Although genome instability is commonly reported to be correlated with measures of poor prognosis, we found no association of any genome instability measures with recurrence free survival, disease free survival, or overall survival in cohort #2 (log-rank test, data not shown). On the other hand, we observed significant association of nodal status with increased numbers of whole chromosome copy number changes (P = 0.046), FGG (P = 0.004), and fraction of the genome altered (FGA, P = 0.024), suggesting that these measures may also serve as biomarkers of nodal status (Supplementary Table S12). We did not, however, find a clear cut point for prediction of nodal status by either measure (Supplementary Fig. S7). Nevertheless, by applying maximally selected χ2 statistics (22), we obtained cut points at 0.065 and 0.095 for FGG and FGA, respectively, yielding sensitivity, specificity, positive predictive value, and NPV of 74%, 68%, 57%, and 82% for FGG and 91%, 48%, 50%, and 90% for FGA compared with 96%, 35%, 46%, and 93% for 3q8pq20 status (Table 4). Thus, with these cut points, FGG and FGA both correctly identify more of the true N0 cases than 3q8pq20 status does; however, more N+ cases are mistakenly called N0.
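The idea behind choosing a cut point by a maximally selected χ2 statistic can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions rather than the procedure of reference 22: it only scans candidate cut points and keeps the one with the largest Pearson χ2 statistic, it omits the adjustment of the P value for having tested many cut points, and the function names and the toy values below are hypothetical (they are not data from cohort #2).

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator if denominator else 0.0

def best_cut_point(values, node_positive):
    """Scan midpoints between sorted distinct values; return (cut, statistic)."""
    distinct = sorted(set(values))
    best_cut, best_stat = None, 0.0
    for low, high in zip(distinct, distinct[1:]):
        cut = (low + high) / 2
        pairs = list(zip(values, node_positive))
        a = sum(1 for v, y in pairs if v > cut and y)        # above cut, N+
        b = sum(1 for v, y in pairs if v > cut and not y)    # above cut, N0
        c = sum(1 for v, y in pairs if v <= cut and y)       # at or below cut, N+
        d = sum(1 for v, y in pairs if v <= cut and not y)   # at or below cut, N0
        stat = chi_square_2x2(a, b, c, d)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat

# Hypothetical instability fractions and nodal status for eight tumors.
toy_fraction = [0.02, 0.04, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30]
toy_node_positive = [False, False, True, False, True, True, False, True]
print(best_cut_point(toy_fraction, toy_node_positive))
```

Because the statistic is maximized over many candidate cut points, the unadjusted χ2 value overstates significance, which is why a dedicated maximally selected statistic is needed for inference.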
In addition, we observed previously described associations with positive nodal status (7, 8), including increased tumor size (P = 0.018), tumor thickness (P = 0.010), and reduced survival (Supplementary Table S13 and Supplementary Fig. S8), providing evidence that the clinical behavior of tumors in cohort #2 is similar to other oral SCC cohorts. We did not, however, identify individual copy number aberrations on a clonewise basis that were significantly associated with clinical characteristics (i.e., nodal status, age, gender, tumor size, tumor thickness, site, and tobacco or alcohol use) after correction for multiple testing (Supplementary Fig. S9). In addition, we found only tumor size to be associated with nodal status among patients with 3q8pq20 tumors (Supplementary Table S14). Assessment of other characteristics (e.g., gene expression signatures) will be required to determine whether it is possible to stratify 3q8pq20 patients for risk of metastasis.
Discussion
By comparison of recurrent copy number alterations in oral precancers and cancers, we have obtained evidence that there are at least 2 pathways of oral cancer development, distinguished by acquisition of one or more of the aberrations +3q, –8p, +8q, and/or +20 in dysplastic lesions. Our observations raise questions as to mechanism, namely the identity of the genes in these regions (3q, 8p, 8q, and 20) and the functional consequences of their gain or loss that provide a growth advantage when at altered copy number early on in the precancers (dysplasia). Identifying the genes from the copy number data alone is challenging, as the involved regions are large and generally of uniformly low copy number gain or loss. Losses involving 8p and gains involving 3q, 8q, and 20q occur frequently in cancers. Some insight into the genes that may be playing a role in deregulating growth in precancerous lesions may be obtained by considering candidate oncogenes and tumor suppressors that have been suggested for these regions based on finding that they are amplified or deleted in tumors. It is important to bear in mind, however, that candidate oncogenes mapping to regions of low level gains in precancers may function differently than they do when at highly elevated copy number. Moreover, the ensemble of genes within these large regions (i.e., the balance of oncogene and tumor suppressor functions) may together promote the preneoplastic changes. Nevertheless, taking this approach, JAG1 seems to be a likely candidate on chromosome 20p, as we found it to be amplified in dysplasia (Table 3), as well as upregulated when amplified in cancer (13). We also observed amplification at 20q11 in SCC cohort #1, suggesting BCL2L1, DNMT3B, E2F1, NCOA6, TGIF2, and ITCH as candidate oncogenes that could be contributing to the early deregulation of growth. Similarly, candidate oncogenes on 8q identified in oral SCC include YWHAZ (23), MYC, PVT1, and associated miRNAs.
Analysis of recurrent regions of amplification on 3q in our oral SCC cohorts found 4 regions, all of which harbor candidate oncogenes previously reported to be amplified or upregulated in cancer (Supplementary Fig. S10).
Notably, the 2 subtypes we identified differ in clinical behavior, the non-3q8pq20 SCCs being associated with a very low risk for cervical node metastasis, suggesting that 3q8pq20 status is a potential biomarker for risk of metastasis. Treatment for oral cancer is almost always surgical. Identification of patients with node positive necks is the most important question to be accurately answered prior to surgical resection of the tumor, as well as for postsurgical treatment and follow-up (24). Typically, patients are assessed prior to surgery for lymph node metastases by palpation of the lymph nodes in the neck and by imaging (computed tomography, MRI, positron emission tomography scan). For patients with clinically node-negative necks, treatment options include a “wait and see” approach or elective neck dissection (i.e., carrying out a neck dissection when there is no clinical or radiographic evidence of neck metastasis) if the chance of metastasis is more than 20% based on current risk assessment capability (24). Currently, tumor thickness is considered the best predictor of metastasis. Because it is difficult to assess this parameter from the incisional biopsy prior to surgery (24), the American Joint Committee on Cancer (AJCC) tumor-node-metastasis staging protocol, which is based on surface diameter of the tumor (25), is often used to assess likelihood of metastasis. It is common in clinical practice not to recommend neck dissection if tumors are less than 2 cm in size (stage T1) and less than 3 mm thick. Occult metastatic rates for oral SCCs, however, are high and range from 20% to 45% for T1 tongue SCCs (24). Thus, the failure to find evidence of metastasis on clinical exam provides little confidence that the patient does not require removal of the cervical lymph nodes. For this reason, in many medical centers, patients are routinely offered elective neck dissection, which subjects some patients to unnecessary surgery.
For example, none of the node negative non-3q8pq20 tumors met the above-mentioned criteria for recommending that the patient could forgo a neck dissection. In addition, 2 of the 14 node negative non-3q8pq20 cases were diagnosed as clinically node positive, but subsequently found to be node negative by pathology (Supplementary Table S2). Assessment of 3q8pq20 status prior to surgery would have added prognostic value and could have spared these 14 patients from unnecessary surgery. Moreover, the risk of metastasis that we initially observed for non-3q8pq20 tumors (less than 7%) is well below the current 20% risk threshold. On the other hand, patients with 3q8pq20 tumors are at substantial risk for metastasis (46%). All these considerations support the potential utility of assessing 3q8pq20 status at the time of diagnostic biopsy to substantially improve clinical decisions regarding elective neck dissection.
We also find that FGG and FGA are correlated with risk for metastasis, although we did not find a clear cut point for either measure. Using cut points of 0.065 and 0.095 for FGG and FGA, respectively, we correctly identified more of the N0 cases than we did on the basis of 3q8pq20 status; however, more N+ cases are mistakenly called N0, which in the clinic may outweigh the benefits of detecting more N0 patients because of the extremely poor survival of patients who undergo surgical salvage for neck metastasis. Larger studies will be required to determine the utility of FGG, FGA, and 3q8pq20 status as biomarkers for cervical node status. For application in the clinic, however, it is likely that evaluation of 3q8pq20 (four loci) will have an advantage, because it would be more amenable to measurement with less complex biomarker assays (e.g., FISH) than would be assessment of genome-wide copy number alterations to determine FGG or FGA. Because eliminating unnecessary neck dissections would reduce surgical risks, patient morbidity, lengthy surgeries (typically 10 hours), and hospitalization time, further multicenter validation studies are clearly warranted.
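The trade-off between the FGG and FGA cut points and 3q8pq20 status can be made concrete by recovering the 2 × 2 counts implied by the reported percentages. The Python sketch below is an inference and not data taken from the study: it assumes the 23 N+ and 40 N0 tumors of cohort #2 (Table 4), assumes that the reported sensitivity, specificity, positive predictive value, and NPV were rounded to whole percentages, and uses a hypothetical helper named `consistent_counts` to search for every table compatible with them.

```python
N_POSITIVE, N_NEGATIVE = 23, 40   # N+ and N0 tumors in cohort #2 (Table 4)

def consistent_counts(sens, spec, ppv, npv,
                      n_pos=N_POSITIVE, n_neg=N_NEGATIVE, tol=0.5):
    """Return every (tp, fp, fn, tn) whose percentages lie within tol of those reported."""
    reported = (sens, spec, ppv, npv)
    matches = []
    for tp in range(n_pos + 1):
        for tn in range(n_neg + 1):
            fn, fp = n_pos - tp, n_neg - tn
            if tp + fp == 0 or tn + fn == 0:
                continue  # predictive values undefined for an empty row
            observed = (100 * tp / n_pos, 100 * tn / n_neg,
                        100 * tp / (tp + fp), 100 * tn / (tn + fn))
            if all(abs(o - r) <= tol + 1e-9 for o, r in zip(observed, reported)):
                matches.append((tp, fp, fn, tn))
    return matches

# Reported sensitivity, specificity, PPV, and NPV (%) for each marker.
for name, percentages in {"FGG": (74, 68, 57, 82),
                          "FGA": (91, 48, 50, 90),
                          "3q8pq20": (96, 35, 46, 93)}.items():
    for tp, fp, fn, tn in consistent_counts(*percentages):
        print(f"{name}: {tn} N0 correctly called, {fn} N+ called N0")
```

Under these assumptions each marker is compatible with a single table, in which FGG and FGA call more N0 cases correctly than 3q8pq20 status does while also assigning more N+ cases to the N0 group.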
There are a growing number of tumor types for which subtypes have been identified that lack copy number instability (14, 26–28). Better prognosis is often associated with these subtypes. In oral cancer, the non-3q8pq20 subtype is clearly a member of this group as there is low genomic instability and a low risk of metastasis. The driving force for these tumors remains obscure. The non-3q8pq20 oral tumors do not seem to have distinguishing methylation profiles or microsatellite instability, leaving open the possibility that there are underlying copy neutral chromosomal rearrangements or extensive mutations in oncogenes and tumor suppressors in this subtype. On the other hand, these tumors may be promoted by extrinsic factors that modify growth of epithelial cells, including inflammation and aberrant behavior of neighboring cells (29). Infection with microorganisms is another candidate; bacteria have been reported in association with certain cancers (30, 31) and also to modify growth signaling pathways in epithelial cells (31, 32).
Conclusion
Copy number analysis of oral cancers and precancers revealed 2 subtypes, 3q8pq20 and non-3q8pq20, distinguished by acquisition of specific copy number alterations in the early precancerous lesions. The 2 subtypes are likely to develop by different pathways that result in tumors differing in their clinical behavior, namely, risk for metastasis. In addition, we note that although much attention has focused on regions of genomic imbalance as biomarkers of progression because they are present at greater frequency in oral SCCs than precancers (33), such markers, at best, can only report on the likelihood of progression of the 3q8pq20 subtype. They cannot provide information on progression of chromosomally stable non-3q8pq20 lesions.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Acknowledgments
The authors thank members of the UCSF Helen Diller Family Comprehensive Cancer Center Microarray Shared Resource Facility for production of BAC arrays. Serge Smeets and Karl Kelsey and their colleagues facilitated access to their published data.
Grant Support
This work was supported by NIH grants CA84118, CA90421, CA118323, and CA131286 to D.G. Albertson and CA113833 to B.L. Schmidt. A. Bhattacharya was the recipient of a predoctoral fellowship from the California Tobacco-Related Disease Research Program (18DT-0011).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.