Abstract
Purpose: To perform methylation array analysis of 807 cancer-associated genes using tissue and saliva of oral squamous cell carcinoma (OSCC) patients with the objective of identifying highly methylated gene loci that hold diagnostic and predictive value as a biomarker.
Experimental Design: We did the methylation array on DNA extracted from preoperative saliva, postoperative saliva, and tissue of 13 patients with OSCC, and saliva of 10 normal subjects. We identified sites that were highly methylated in the tissue and preoperative saliva samples but not methylated in the postoperative saliva samples or in normal subjects.
Results: High quality DNA was obtained and the methylation array was successfully run on all samples. We identified significant differences in methylation patterns between the preoperative and postoperative saliva from cancer patients. We established a gene classifier consisting of 41 gene loci from 34 genes that showed methylation in preoperative saliva and tissue but were not methylated in postoperative saliva or normal subjects. Gene panels of 4 to 10 genes were constructed from genes in the classifier. The panels had a sensitivity of 62% to 77% and a specificity of 83% to 100% for OSCC.
Conclusions: We report methylation array analysis of 807 cancer-associated genes in the saliva of oral cancer patients before and after oral cancer resection. Our methylation biomarker approach shows the proof of principle that methylation array analysis of saliva can produce a set of cancer-related genes that are specific and can be used as a composite biomarker for the early detection of oral cancer. (Cancer Epidemiol Biomarkers Prev 2008;17(12):3603–11)
Introduction
Biomarker detection within biological fluids shows promise for the early diagnosis of cancer. In particular, evaluation of fluids approximating the cancer has significant clinical applicability. For example, sputum analysis has been used to detect lung carcinoma (1). In similar fashion, saliva is the proximal fluid for head and neck squamous cell carcinoma (SCC). The cellular and fluid content of whole saliva, which includes protein, genetic, and epigenetic changes, has been studied in head and neck SCC. Promoter hypermethylation is an epigenetic change that involves the addition of methyl groups to cytosine residues in the context of a CpG dinucleotide. This usually occurs in the promoter region of a gene, which contains a high density of CpG dinucleotides, termed CpG islands. The methyl group will interfere with transcriptional proteins resulting in long-term silencing of that gene. Promoter hypermethylation is a critical step in oral carcinogenesis and has a number of significant advantages over genetic and protein diagnostic markers. Epigenetic silencing events (i.e., promoter hypermethylation) are more frequent mechanisms of gene silencing than genetic changes, making it a more attractive marker than detecting a genetic mutation or measuring gene expression. It is one of the earliest events in oral carcinogenesis, preceding protein expression level changes. In fact, promoter hypermethylation is a more frequent mechanism in gene silencing than genetic mutation (2). Because DNA methylation leads to gene silencing (a negative biological event), protein is not produced and immunohistochemistry or ELISA cannot be used in a clinical setting. For a diagnostic test to be implemented clinically, the test must measure a positive event. Therefore, by analyzing for DNA methylation, we can turn a negative biological event into a positive clinical test. 
Previous studies analyzing promoter hypermethylation have looked at a panel of 2 to 20 genes to establish the sensitivity of detection of head and neck cancer (3-9). In these studies, genes have been chosen based on their known role in head and neck carcinogenesis. The specificity and sensitivity of this approach has not yielded a gene panel that is viable in a clinical setting. Moreover, current techniques to measure promoter methylation, which include combined bisulfite restriction analysis (10), quantitative methylation-specific PCR (MSPCR; refs. 11, 12) and pyrosequencing (13), are inadequate for genome-wide methylation analysis because the labor required for such an analysis with one of these techniques is prohibitive. However, once a gene panel consisting of a manageable number of genes has been developed using a genome-wide approach, one of the above technical approaches could be implemented in a clinical laboratory on a routine basis. We sought to discover genes that have not been previously studied in head and neck cancer that might hold significant diagnostic value. We targeted our discovery approach by analyzing the matched preoperative saliva, postoperative saliva, and tissue of a specific head and neck cancer subsite (oral cavity) with a newly available methylation array (GoldenGate Methylation Array; Illumina), which includes 1,505 CpG loci covering 807 genes (14). We then converged on a panel of gene loci that served as our classifier.
Materials and Methods
Collection of Saliva and Tissue
The project was approved by the University of California San Francisco Committee on Human Research. We enrolled oral SCC patients with the following inclusion criteria:
biopsy-proven oral cavity SCC with residual SCC after biopsy based on clinical examination
no history of prior surgical, chemotherapeutic, or radiation treatment for head and neck SCC.
We collected preoperative saliva, postoperative saliva, and tissue specimens from oral SCC patients and saliva from normal subjects. Demographic and health information, including age, sex, and tobacco and alcohol habits, was recorded for each patient. Cancer patients were staged according to the American Joint Committee on Cancer tumor-node-metastasis staging system (15). Whole saliva (7.5 mL) was collected from patients between 6:30 a.m. and 8:00 p.m. after they had taken nothing by mouth for 12 h. Cancer tissue was collected from the same patients. Saliva was collected under similar conditions (time of day and nothing by mouth) from 10 normal subjects without a history of an upper aerodigestive tract lesion. After collection, saliva and tissue samples were stored at −80°C as we have previously described (16, 17). The postoperative samples were collected ∼4 wk after surgery. If radiation therapy was to be given, the postoperative saliva sample was collected before beginning radiation therapy. Similar conditions (time of day and nothing by mouth) were used for collection of the postoperative saliva samples. The pathology reports were reviewed to confirm complete removal of the SCC at the primary site. If there was evidence of residual carcinoma, the patient was removed from the study. All patients were followed after surgery to monitor for recurrence; follow-up durations are listed in Table 1.
Extraction of DNA From Saliva and Cancer Tissues
Genomic DNA was extracted from saliva of normal subjects and SCC patients and from tissue of SCC patients. For saliva, genomic DNA was extracted from 1,000 μL with a commercially available automated DNA extraction kit (iPrep Chargeswitch Buccal Cell kit; Invitrogen). Briefly, 1 mL of whole saliva was washed and centrifuged with 4 mL of PBS. The supernatant was decanted and the cell pellet was resuspended in 1 mL of lysis mix provided in the kit and 10 μL of proteinase K. The sample was incubated at 37°C for 20 min. The DNA was extracted from each sample using the iPrep Purification Instrument (Invitrogen) with the Buccal Protocol setting with a total run time of 20 min. DNA was eluted with 150 μL of elution buffer. For each patient from whom saliva was collected, the paraffinized tissue blocks containing carcinoma were obtained. Ten 10-μm sections were cut from the blocks, and genomic DNA was then extracted with the QiaAmp Blood kit under the Paraffinized Tissue protocol (Qiagen) and eluted in 200 μL of elution buffer from the kit. DNA yield and quality were assessed with spectrophotometry (Nanodrop Technologies). Five hundred nanograms of genomic DNA were then chemically modified with sodium bisulfite to convert all unmethylated cytosines to uracils while leaving methylated cytosines unconverted (EZ DNA Methylation kit; Zymo Research). To minimize variability, bisulfite conversion was done on a 96-well plate. Bisulfite conversion using the EZ DNA Methylation kit has been shown to be 99.7% efficient and is most compatible with the Illumina GoldenGate Methylation Array (14).
Description of Illumina GoldenGate Methylation Array
We analyzed our samples using the GoldenGate Methylation Array (Illumina), an array-based platform in which probes query potential methylation sites after bisulfite conversion of genomic DNA. The panel includes 1,505 CpG loci selected from 807 cancer-related genes (14). The genes on the array were selected for their biological relevance to cancer and include tumor suppressor genes, oncogenes, X-linked genes, imprinted genes, and genes involved in DNA repair, cell cycle control, differentiation, and apoptosis (18-20). Fluorescence data were analyzed with BeadStudio software (Illumina). The Background Normalization Algorithm was used to minimize background variation within the array by using built-in negative control signals. Negative controls allow for estimation of the expected signal level without hybridization to a specific target. The average of all negative control signals is subtracted from the probe signals. Gene loci coverage was redundant, and the signal was reported as a β value, which was the average of multiple measurements taken from redundant probes for the queried gene locus.
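The background normalization step described here can be illustrated with a minimal sketch. It assumes the correction is a plain subtraction of the mean negative-control signal, as the text states; the function name and the intensity values are hypothetical and are not taken from BeadStudio or from the study data.

```python
from statistics import mean


def background_normalize(probe_signals, negative_controls):
    """Subtract the mean negative-control signal from every probe signal."""
    background = mean(negative_controls)
    return [signal - background for signal in probe_signals]


# Hypothetical raw intensities in arbitrary units (not study data).
negative_controls = [90.0, 110.0, 100.0]
probe_signals = [1500.0, 250.0, 95.0]

normalized = background_normalize(probe_signals, negative_controls)
print(normalized)
```

A probe whose raw signal is below the background estimate becomes slightly negative after subtraction, which is why downstream calculations need to handle negative intensities.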
The methylation assay was run through the University of California, San Francisco Genome Core.
The GoldenGate Assay was used to detect methylation of 1,505 CpG sites in promoter regions of 807 genes. Hybridization of the DNA to a set of allele-specific oligonucleotides and a locus-specific oligonucleotide was dependent on the methylation status at the CpG site of interest. After the hybridization step, extension of the appropriate allele-specific oligonucleotide and ligation of the product to the locus-specific oligonucleotide created a product that provided a template for PCR using universal PCR primers. Universal primer 1 was Cy3 labeled and amplified the unmethylated template DNA, whereas universal primer 2 was Cy5 labeled and amplified the methylated template DNA. After amplification, the dye-labeled DNA was hybridized to its complement bead type through the unique IllumiCode address. The fluorescence signal was analyzed on a Sentrix Array Matrix, with Cy5 fluorescence representing methylation at the CpG locus and Cy3 fluorescence representing an unmethylated CpG locus. A methylation analysis algorithm was used to obtain methylation data for individual loci in individual samples, expressed as the β value. The β value was used to estimate the methylation level of the CpG locus using the ratio of intensities between methylated and unmethylated alleles. β is calculated as:

$$\beta = \frac{\max(\mathrm{Cy5}, 0)}{|\mathrm{Cy3}| + |\mathrm{Cy5}| + 100}$$

The SD of the β value for all of the 1,505 CpG sites across the replicates on the array has been shown to be <0.06 in 99% of cases and the array can discriminate levels of methylation that differ by as little as 0.17 (14).
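The β calculation can be sketched in a few lines. The sketch assumes the commonly documented Illumina form, β = max(Cy5, 0) / (|Cy3| + |Cy5| + 100), with Cy5 as the methylated signal and Cy3 as the unmethylated signal as described above. The offset of 100 and the averaging of per-measurement β values across redundant probes are assumptions of this illustration, and the function names are hypothetical.

```python
def beta_value(cy3, cy5, offset=100.0):
    """Estimate methylation from unmethylated (Cy3) and methylated (Cy5) signal.

    The offset keeps the ratio stable when both intensities are near zero.
    """
    return max(cy5, 0.0) / (abs(cy3) + abs(cy5) + offset)


def locus_beta(measurements, offset=100.0):
    """Average beta over redundant (cy3, cy5) measurements for one locus."""
    betas = [beta_value(cy3, cy5, offset) for cy3, cy5 in measurements]
    return sum(betas) / len(betas)


# Hypothetical background-corrected intensities (not study data).
print(beta_value(900.0, 0.0))     # no methylated signal
print(beta_value(0.0, 900.0))     # methylated signal only
print(locus_beta([(450.0, 450.0), (0.0, 900.0)]))
```

Because of the offset, β stays below 1 even when only the methylated allele is detected, and a negative methylated intensity left over from background subtraction yields β = 0.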
Construction of the Methylation Classifier
Methylated gene loci from the following four groups of samples were included in our analysis: (a) preoperative saliva, (b) postoperative saliva, (c) cancer tissue (a, b, and c were patient-matched), and (d) saliva from normal subjects. The investigator analyzing the methylation array results was blinded to the condition of the patients and subjects. We constructed our methylation classifier based on the following criteria:
the classifier contains methylated genes common to both the preoperative saliva and cancer tissue
the classifier excludes the methylated genes present in either the postoperative saliva or normal saliva.
We first identified unmethylated genes in the postoperative saliva and normal saliva samples. Gene loci from the postoperative saliva and normal saliva samples that had a threshold β value below 0.1 were considered unmethylated and were retained for further analysis. Once this pool of unmethylated genes was produced, we analyzed the methylated genes present in preoperative saliva. Gene loci with a β threshold value above 0.2 in any of the preoperative saliva samples were considered methylated and were retained. With the list of methylated gene loci present in the preoperative saliva samples but not present in the pool of postoperative saliva and normal saliva, we looked for methylated genes that were also present in the cancer tissue samples. For identification of gene methylation in the cancer tissues, the β threshold value was again set at 0.2. This final set of methylated genes constituted our classifier. Specificity and sensitivity were calculated for each gene locus.
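The selection steps above can be expressed as a short filter. This is a minimal sketch under stated assumptions: β values are held in plain dictionaries, a locus counts as unmethylated in the control pool when at least 90% of postoperative and normal samples have β < 0.1 (the fraction given in Results), and a locus counts as methylated in preoperative saliva or tissue when any sample has β > 0.2. The locus names and β values are invented for illustration.

```python
UNMETH_MAX = 0.1         # beta below this counts as unmethylated
METH_MIN = 0.2           # beta above this counts as methylated
CONTROL_FRACTION = 0.9   # share of control samples that must be unmethylated


def mostly_unmethylated(betas, fraction=CONTROL_FRACTION):
    """True if at least `fraction` of the samples fall below UNMETH_MAX."""
    below = sum(1 for b in betas if b < UNMETH_MAX)
    return below >= fraction * len(betas)


def any_methylated(betas):
    """True if any sample exceeds METH_MIN."""
    return any(b > METH_MIN for b in betas)


def build_classifier(loci):
    """Return loci methylated in preop saliva and tissue but not in controls."""
    selected = []
    for name, groups in loci.items():
        controls = groups["postop"] + groups["normal"]
        if (mostly_unmethylated(controls)
                and any_methylated(groups["preop"])
                and any_methylated(groups["tissue"])):
            selected.append(name)
    return sorted(selected)


# Hypothetical beta values per sample group (not study data).
loci = {
    "LOCUS_A": {"preop": [0.05, 0.45], "postop": [0.02, 0.04],
                "normal": [0.03, 0.01], "tissue": [0.10, 0.60]},
    "LOCUS_B": {"preop": [0.30, 0.50], "postop": [0.25, 0.04],
                "normal": [0.03, 0.01], "tissue": [0.40, 0.60]},
    "LOCUS_C": {"preop": [0.35, 0.05], "postop": [0.02, 0.04],
                "normal": [0.03, 0.01], "tissue": [0.05, 0.08]},
}

classifier = build_classifier(loci)
print(classifier)
```

In this toy input, LOCUS_B is dropped because one postoperative sample is methylated, and LOCUS_C is dropped because no tissue sample is methylated.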
Agglomerative Hierarchical Clustering of Saliva Samples Based Upon Methylated Gene Loci
Fluorescent signal values of the 1,505 gene loci from all samples were imported into the BeadStudio software (Illumina). We established a mathematical expression filter by using the inclusion and exclusion criteria listed above for the classifier. Subsequently, a heat map was created based on the data matrix representing the β values for the preoperative and postoperative saliva samples across the 41 gene loci within the classifier. Values represented on the heat map are β values. We then did hierarchical clustering within the heat map.
Development of Gene Panels from Classifier
Gene panels, consisting of 4 to 10 gene loci chosen from the 41 gene loci classifier, were developed for potential use in a clinical trial. The purpose of developing such a gene panel was to satisfy the technical and financial constraints of a diagnostic laboratory assay. With this objective in mind, we reviewed the 41 gene loci within the classifier to develop multiple gene panels consisting of 4 to 10 gene loci each. Combinations of gene loci were evaluated for sensitivity and specificity. The combinations that produced the highest sensitivity and specificity were chosen. Gene loci that were methylated in different preoperative saliva samples were grouped together within a panel to maximize sensitivity. A total of nine gene panels were constructed, which included all of the 41 gene loci at least once. The sensitivity and specificity for each of these gene panels were then calculated.
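Panel-level sensitivity and specificity can be sketched as follows. The sketch assumes that a sample tests positive when any locus in the panel has β > 0.2, which is one reading of grouping loci methylated in different samples to maximize sensitivity; the text does not state the combination rule explicitly. Locus names and β values are hypothetical.

```python
METH_MIN = 0.2  # beta above this counts as methylated


def panel_positive(sample, panel):
    """A sample is test positive if any locus in the panel is methylated."""
    return any(sample.get(locus, 0.0) > METH_MIN for locus in panel)


def sensitivity_specificity(panel, disease_samples, control_samples):
    """Return (sensitivity, specificity) of a panel as fractions."""
    true_pos = sum(panel_positive(s, panel) for s in disease_samples)
    true_neg = sum(not panel_positive(s, panel) for s in control_samples)
    return true_pos / len(disease_samples), true_neg / len(control_samples)


# Hypothetical beta values (not study data).
panel = ["LOCUS_A", "LOCUS_B"]
preop = [
    {"LOCUS_A": 0.50, "LOCUS_B": 0.00},
    {"LOCUS_A": 0.05, "LOCUS_B": 0.30},
    {"LOCUS_A": 0.05, "LOCUS_B": 0.10},
    {"LOCUS_A": 0.40, "LOCUS_B": 0.40},
]
controls = [
    {"LOCUS_A": 0.02, "LOCUS_B": 0.03},
    {"LOCUS_A": 0.25, "LOCUS_B": 0.01},
    {"LOCUS_A": 0.00, "LOCUS_B": 0.00},
    {"LOCUS_A": 0.05, "LOCUS_B": 0.08},
    {"LOCUS_A": 0.10, "LOCUS_B": 0.15},
]

sens, spec = sensitivity_specificity(panel, preop, controls)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Under an any-locus rule, adding loci to a panel can only raise sensitivity or leave it unchanged, and can only lower specificity or leave it unchanged, which is the trade-off that panel construction has to balance.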
Results
Oral SCC Patient and Normal Subject Demographics
The patient demographics, tumor characteristics, and outcomes of the oral SCC patients enrolled in the study are listed in Table 1. The group consisted of 11 men and 2 women. The average age of the study group was 60.8 years (median, 61; range, 35-81). The demographic information for the normal subjects is listed in Table 2. The group of normal subjects consisted of eight men and two women. The average age of the normal subjects was 45.5 years (median, 41; range, 32-64). Pathology reports for all patients were reviewed and total surgical resection was confirmed. Tumor staging was also confirmed. Each patient was seen monthly for the first year after surgery and every 2 months during the second year. Postoperative saliva was collected at the first-month postoperative visit. No patient experienced delayed healing or infection. In all cases, postoperative saliva was collected before starting radiation therapy in those patients requiring radiation therapy. Postoperative saliva was collected under the same conditions as preoperative saliva. Whole saliva was immediately frozen at −80°C after collection. None of the patients in the study developed local recurrence. Two of the patients died of cervical metastasis. For the 11 patients who survived, the average follow-up was 15.1 months (median, 13 months; range, 10-30). None of the oral SCC patients or normal subjects had a cancer of another histologic type during the period of the study.
Table 1. Demographics, tumor characteristics, and outcome of the oral SCC patients

Patient | Age (y) | Race | Gender | Tumor location | TNM tumor staging | Outcome |
---|---|---|---|---|---|---|
Patient 1 | 54 | Caucasian | M | Tongue | T1N0M0 | NED, 30 mo |
Patient 2 | 70 | Hispanic | M | Maxillary gingiva | T4N2cM0 | Died from neck metastasis at 6 mo |
Patient 3 | 70 | Asian | M | Tongue | T1N0M0 | Died from neck metastasis at 15 mo |
Patient 4 | 55 | Asian | M | Tongue | T1N1M0 | NED, 20 mo |
Patient 5 | 35 | Caucasian | M | Tongue | T1N2bM0 | NED, 12 mo |
Patient 6 | 81 | Caucasian | F | Tongue | T1N0M0 | NED, 17 mo |
Patient 7 | 51 | Caucasian | M | Tongue | T1N0M0 | NED, 12 mo |
Patient 8 | 62 | Caucasian | M | Tongue | T1N0M0 | NED, 15 mo |
Patient 9 | 59 | Caucasian | M | Tongue | T1N0M0 | NED, 12 mo |
Patient 10 | 79 | Caucasian | M | Mandible | T4aN1M0 | NED, 13 mo |
Patient 11 | 61 | Caucasian | M | Floor of mouth | T1N0M0 | NED, 13 mo |
Patient 12 | 69 | Caucasian | M | Tongue | T2N0M0 | NED, 12 mo |
Patient 13 | 45 | Asian | F | Tongue | T3N0M0 | NED, 10 mo |
Abbreviations: TNM, tumor-node-metastasis; NED, no evidence of disease.
Table 2. Demographics of the normal subjects

Patient | Age (y) | Race | Gender |
---|---|---|---|
Patient 1 | 32 | Asian | F |
Patient 2 | 35 | Asian | M |
Patient 3 | 34 | Caucasian | M |
Patient 4 | 36 | Caucasian | M |
Patient 5 | 39 | Caucasian | M |
Patient 6 | 50 | Black | M |
Patient 7 | 62 | Caucasian | M |
Patient 8 | 60 | Caucasian | F |
Patient 9 | 43 | Asian | M |
Patient 10 | 64 | Caucasian | M |
Methylation Array Results for Saliva and Cancer Tissue
DNA was successfully isolated from each saliva and cancer tissue sample. The average DNA extracted from each 1 mL of saliva was 50 ng/μL (range, 20-80 ng/μL) with a total volume of 150 μL. For the array, at least 500 ng of genomic DNA was extracted from each sample and was treated with sodium bisulfite. The DNA was labeled and hybridized to the methylation array containing the 1,505 CpG loci covering 807 cancer-related genes. Fluorescent signals from each sample were quantified, normalized, and converted to a β value. Replicates were not necessary for the array because of previous demonstration of a consistent β value across duplicates.
Selection of Genes Comprising the Classifier
Selection of genes was based on the criteria described in Materials and Methods. To begin, the postoperative saliva samples and normal saliva samples were evaluated. All gene loci in which at least 90% of the normal and postoperative samples had a β value of <0.1 were included. From those gene loci that were included, we evaluated genes with a β value of >0.2 in preoperative samples. Out of 1,505 gene loci, 64 gene loci met the above two criteria.
Bivariate Correlation of Preoperative Saliva and Cancer Tissue Samples
We subsequently characterized the cancer tissue using the same criteria as described above for preoperative saliva (i.e., samples with a β value of >0.2 were classified as methylated). We compared the 64 gene loci selected using the methylation data of the saliva samples against the methylated genes in the paired cancer tissue. Of these 64 gene loci, 41 gene loci from 34 genes were methylated in both preoperative saliva and cancer tissue and were selected for the final classifier. The gene name, location of probe, and gene function for these 41 gene loci comprising the classifier are listed in Table 3. Figure 1 depicts the number of samples that were methylated at each of these gene loci.
Table 3. The 41 gene loci (34 genes) comprising the classifier

Methylated gene | Loci and strand of methylation | Function |
---|---|---|
ADCYAP1 | P398_F, P455_R | Cell signaling |
AGTR1 | P41_F | Cell signaling |
BMP3 | P56_R | Cell signaling, cell differentiation |
CEBPA | P706_F | Regulation of transcription, cell differentiation |
EPHA5 | E158_R | Cell signaling |
ERBB4 | P255_F | Cell signaling, cell proliferation, development |
ESR1 | E298_R | Regulation of transcription, cell signaling, cell growth |
ETV1 | P235_F | Regulation of transcription |
EYA4 | P508_F | Regulation of transcription |
FGF3 | E198_R, P171_R | Cell cycle control, cell signaling, cell proliferation, development |
FGF8 | P473_F | Cell cycle control, cell signaling, cell proliferation, development |
FLT1 | E444_F, P302_F, P615_R | Angiogenesis, cell proliferation, cell differentiation |
GABRB3 | E42_F | Cell signaling |
GALR1 | E52_F | Cell signaling |
GAS7 | E148_F | Cell cycle control, cell differentiation, development |
HLF | E192_F | Regulation of transcription, development |
IHH | E186_F | Cell signaling, development |
IL11 | P11_R | Cell signaling, cell differentiation |
INSR | P1063_R | Development |
IRAK3 | P13_F | Cell signaling |
KDR | E79_F, P445_R | Angiogenesis, cell signaling, cell differentiation |
NOTCH3 | E403_F | Regulation of transcription, cell signaling, cell differentiation |
NTRK3 | E131_F, P636_R, P752_F | Cell signaling, cell differentiation, development |
p16 | seq_47_S188_R | Cell cycle control |
PKD2 | P287_R | Cell adhesion, development |
PTCH2 | P37_F | Cell differentiation, development |
PXN | P308_F | Cell adhesion, cell signaling, development |
RASGRF1 | E16_F | Cell adhesion, development |
TFPI2 | P9_F | Blood coagulation |
TMEFF1 | P234_F | Cell signaling |
TNFSF10 | E53_F | Apoptosis |
TWIST1 | P44_R | Regulation of transcription, cell differentiation, development |
WNT2 | E109_R | Cell signaling, development |
WT1 | P853_F | Regulation of transcription, cell cycle control |
Clustering of Preoperative Saliva and Postoperative Saliva Samples Based on Methylated Genes
We did hierarchical clustering of the 41 gene loci in the classifier in preoperative and postoperative saliva samples. The Manhattan distance metric was used to cluster samples. This metric computes the dissimilarity between two samples as the sum of the absolute differences in their β values across the gene loci within the classifier. Figure 2 depicts the heat map for the preoperative and postoperative saliva samples relative to the 41 gene loci that were included in the classifier. The postoperative saliva samples clustered together, whereas the preoperative saliva samples clustered at the two extremes of the grid. Imperfect clustering reflects <100% specificity for certain gene loci in the classifier.
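Agglomerative clustering with a Manhattan metric can be sketched in plain Python. The linkage rule used by the clustering software is not stated in the text, so average linkage is assumed here purely for illustration; the sample names and β profiles are hypothetical.

```python
def manhattan(a, b):
    """Sum of absolute beta differences between two samples across loci."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster_distance(c1, c2, profiles):
    """Average linkage: mean pairwise Manhattan distance between clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(manhattan(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)


def agglomerate(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j], profiles)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical beta profiles across three loci (not study data).
profiles = {
    "pre_1": [0.60, 0.45, 0.30],
    "pre_2": [0.55, 0.50, 0.25],
    "post_1": [0.03, 0.05, 0.02],
    "post_2": [0.04, 0.02, 0.06],
}

result = agglomerate(profiles, n_clusters=2)
print(result)
```

With these toy profiles the two postoperative samples merge first because their β values are all near zero, and the two preoperative samples form the second cluster.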
Sensitivity and Specificity for Gene Panels
The nine gene panels that we developed and their associated sensitivity and specificity are listed in Table 4. For each gene panel in Table 4, test positivity (gene methylated) relative to disease positivity (preoperative saliva in oral SCC patients) is listed beside sensitivity; test negativity (gene not methylated) relative to disease negativity (postoperative saliva in oral SCC patients + normal subjects) is listed beside specificity. The gene panels ranged in their sensitivity from 62% to 77% and in their specificity from 83% to 100%. The highest sensitivity of 77% was associated with a gene panel comprising the gene loci GABRB3_E42_F, IL11_P11_R, INSR_P1063_R, NOTCH3_E403_F, NTRK3_E131_F, and PXN_P308_F. This gene panel had a specificity of 87%, which was lower than that of two other gene panels. The overall sensitivity and specificity of the 41 gene loci were 77% and 35%, respectively. However, constructing gene panels from the 41 gene loci served to increase specificity while keeping sensitivity levels constant.
Table 4. Sensitivity and specificity of the nine gene panels

Gene panel | Sensitivity (test positive/disease positive) | Specificity (test negative/disease negative) |
---|---|---|
AGTR1_P41_F, ESR1_E298_R, FLT1_E444_F, NOTCH3_E403_F | 69% (9/13) | 83% (19/23) |
GABRB3_E42_F, IL11_P11_R, INSR_P1063_R, NOTCH3_E403_F, NTRK3_E131_F, PXN_P308_F | 77% (10/13) | 87% (20/23) |
ERBB4_P255_F, IL11_P11_R, PTCH2_P37_F, TMEFF1_P234_F, TNFSF10_E53_F, TWIST1_P44_R | 62% (8/13) | 100% (23/23) |
ADCYAP1_P455_R, CEBPA_P706_F, EPHA5_E158_R, FGF3_E198_R, HLF_E192_F, IL11_P11_R, INSR_P1063_R, NOTCH3_E403_F | 69% (9/13) | 96% (22/23) |
AGTR1_P41_F, BMP3_P56_R, FGF8_P473_F, NTRK3_E131_F | 62% (8/13) | 87% (20/23) |
ERBB4_P255_F, FLT1_P615_R, INSR_P1063_R, IRAK3_P13_F, KDR_P445_R, NTRK3_P636_R, PTCH2_P37_F, PXN_P308_F, RASGRF1_E16_F, WT1_P853_F | 69% (9/13) | 78% (18/23) |
ESR1_E298_R, ETV1_P235_F, GAS7_E148_F, IL11_P11_R, PKD2_P287_R, TMEFF1_P234_F, WNT2_E109_R | 62% (8/13) | 83% (19/23) |
EPHA5_E158_R, FGF3_P171_R, GALR1_E52_F, IL11_P11_R, INSR_P1063_R, KDR_E79_F, p16_seq_47_S188_R | 62% (8/13) | 83% (19/23) |
AGTR1_P41_F, ERBB4_P255_F, EYA4_P508_F, FLT1_P302_F, IHH_E186_F, NTRK3_E131_F, NTRK3_P752_F, TFPI2_P9_F | 62% (8/13) | 78% (18/23) |
Discussion
In this study, we applied a high-throughput and reproducible array to analyze DNA methylation across 807 cancer-associated genes in the saliva of oral cancer patients. We quantified DNA methylation in both preoperative saliva and postoperative saliva collected after confirmed total resection of the oral cancer. The methylation array results were used to build a classifier consisting of 41 gene loci from 34 genes. The classifier was constructed using methylation results from preoperative saliva, postoperative saliva, associated cancer tissue, and saliva from normal subjects. The genes within the classifier are involved in cell signaling, cell differentiation, cell proliferation, cell cycle control, regulation of transcription, angiogenesis, cell adhesion, blood coagulation, apoptosis, and development. The classifier was used to construct multiple potential diagnostic gene panels that consisted of 4 to 10 gene loci. The sensitivity and specificity of these gene panels (Table 4) were higher than those reported in previous studies of saliva in head and neck cancer patients (8, 17, 21). The methylation array of 807 genes that we used in this study included the majority of genes that have previously been associated with head and neck cancer (e.g., APC, DAPK, and MGMT). However, only three of these previously studied genes showed positive methylation (CDKN2A, listed as p16 in Table 3; ESR1; and NOTCH3). One possible explanation for this discrepancy between previous methylation studies and our current methylation array study lies in probe design and the differences in regions where these probes hybridize. For example, for the genes APC, DAPK, and MGMT, the probes on the array are designed for regions further upstream relative to other techniques measuring methylation such as MSPCR. Because of this probe design, a higher level of methylation is required for hybridization on the array. Such probe design leads to fewer false positives and is more likely to indicate methylation levels that are biologically relevant.
We propose that the gene panels presented in Table 4 could be evaluated in a focused study targeted at optimizing a clinical test. One possible approach to using the gene panel in a clinical setting is quantitative MSPCR. Quantitative MSPCR has significant advantages over other methods such as combined bisulfite restriction analysis and pyrosequencing because (a) the resultant values lie along a continuum and the data are quantitative, and (b) Goldenberg et al. (9) showed that promoter methylation can be quantified with MSPCR in 4 to 5 hours in a clinical setting. MethyLight is a recently developed, related MSPCR method that uses a mathematical equation to interpret results obtained from quantitative PCR to yield methylation status. We have used MethyLight to quantify promoter hypermethylation in the saliva of oral cancer and dysplasia patients (17). MethyLight could be combined with multiplex PCR, allowing an investigator or laboratory to evaluate all genes within one gene panel in one run in a cost-effective manner that could be translated to patient management.
The methylation threshold chosen for our study was based on available studies that evaluated the percentage of promoter methylation leading to a biologically relevant event, as evidenced by loss of protein expression, and on our goal of achieving acceptable sensitivity and specificity. Shaw et al. (22) observed that for the set of genes in their study, normal tissue samples tended to show background methylation of 0% to 5%. The authors therefore classified all cancer samples with >5% methylation as positive for methylation. They also showed that methylation levels of >5% led to loss of protein expression. Ogino et al. (12) showed that a percentage of methylated reference (i.e., degree of methylation) of >4% is associated with loss of respective protein expression. We also based our β value cutoffs on the study by Bibikova et al. (14), which validated the GoldenGate Methylation Array technique. Two of the four validation techniques used in that study influenced our choice of cutoffs for negative and positive methylation. First, Bibikova et al. (14) designed a technique that estimates the ability of the array to detect methylation differences between samples. To do this, they measured methylation in six “gender-specific” CpG sites, which were only methylated on the X chromosome and not on the Y chromosome. The authors titrated male DNA with varying levels of female DNA. Their results showed that for the sample with 100% male DNA, which should show no methylation, β values were <0.10. From this validation technique, we concluded that β values up to 0.10 should be considered negative for methylation. The second validation technique on which we focused was the comparison of array β values with the percentage of methylation in the gene obtained from bisulfite sequencing. Bisulfite sequencing remains the gold standard for detecting methylation in a gene and is therefore the most reliable indicator of percentage of methylation. 
When the β values of six samples across eight genes from the array were compared with their percentage of methylation obtained from bisulfite sequencing, β values of <0.20 showed 0% methylation in all cases except for one sample at one gene. This result led us to conclude that our cutoff for positive methylation should be a β value of >0.20.
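The two cutoffs derived above can be summarized in a short Python sketch. The function name and the "indeterminate" label for β values falling between the two cutoffs are assumptions added for illustration; the text defines only the negative (up to 0.10) and positive (>0.20) calls.

```python
NEGATIVE_MAX = 0.10  # beta values up to 0.10 are considered negative
POSITIVE_MIN = 0.20  # beta values above 0.20 are considered positive


def classify_beta(beta):
    """Map an array beta value (0 to 1) to a methylation call.

    Values between the two cutoffs are labeled 'indeterminate' here;
    that label is an assumption of this sketch, not part of the study.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie between 0 and 1")
    if beta > POSITIVE_MIN:
        return "methylated"
    if beta <= NEGATIVE_MAX:
        return "unmethylated"
    return "indeterminate"


for value in (0.05, 0.15, 0.45):
    print(value, classify_beta(value))
```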
Saliva is the ideal diagnostic and predictive biofluid for head and neck cancer. One potential advantage of saliva as a diagnostic and predictive fluid is that it samples cells from the entire lesion and the entire oral cavity because the saliva bathes the oromucosal surfaces. A scalpel biopsy, on the other hand, is prone to sampling error given that only a small portion of the lesion is sampled. One drawback of saliva is that variations within the technical protocol can profoundly affect the quality and reproducibility of the results. The conditions under which the saliva is collected determine the composition and concentration of gene and protein products within the saliva. In particular, there are significant diurnal variations within saliva, as well as changes that occur with eating and drinking (23). To control for these variables in the current study, we required that all saliva samples be collected between 6:30 and 8:00 a.m., after the patient had had nothing to eat or drink since midnight the night before. This protocol was used for cancer patients at the preoperative and postoperative times of collection, as well as for the normal subjects. The most technically sensitive step is DNA extraction from the whole saliva. We explored multiple DNA extraction methods and compared the yield and quality of the extracted DNA using spectrophotometry (Nanodrop Technologies). We selected the automated DNA extraction protocol described in the Materials and Methods because it yielded the highest quality and quantity of DNA. After extraction, accurate differentiation of methylated and unmethylated DNA necessitates highly efficient bisulfite conversion. Our bisulfite conversion protocol has 99.7% efficiency and is the recommended method to be used with the Illumina GoldenGate Methylation Array (14). In patients with oral cancer, the saliva contains both cancer cells and normal cells; therefore, DNA will be extracted from both cell types. 
On the other hand, DNA extracted from cancerous tissue in these patients would primarily be from the cancerous cells. We have shown high rates of concordance between matched saliva and tissue in genes explored in our previous studies (17).
Other studies have used promoter hypermethylation in saliva to detect head and neck cancer. For example, Carvalho et al. (8) used quantitative MSPCR on 21 genes to analyze both saliva and serum in 211 head and neck SCC patients and 527 normal controls. Using this approach, the authors showed promoter hypermethylation in SCC patients compared with normal subjects. The authors chose panels of 2 to 5 genes that were subsets of the 21 genes. One of their gene panels had a specificity as high as 92.5%; however, the sensitivity of this panel was 31.4%. The genes selected by the authors were chosen from previous studies on promoter methylation in head and neck SCC patients (8). Despite the association between the selected genes and head and neck SCC, their inclusion did not yield a clinically acceptable sensitivity. Therefore, in the current study, we used a methylation array to discover genes that have not been associated with head and neck SCC but might have a high likelihood of being methylated in the preoperative saliva samples of cancer patients. Our aim with this approach is to increase sensitivity and specificity. An additional difference in our study is that we focused only on oral cavity SCCs and did not include other head and neck sites. Saliva is the proximal fluid for oral SCC. Previous studies using saliva as a detection biofluid for head and neck SCC, such as Carvalho et al. (8), Righini et al. (21), and Rosas et al. (5), included patients with SCCs of the oral cavity, larynx, and pharynx. SCCs from these different subsites are known to have significant biological variability, and it is unclear whether saliva would contain an appropriate concentration of carcinoma cells from laryngeal and pharyngeal sites.
The hierarchical clustering done on the heat map generated from β values shows a bimodal distribution of preoperative saliva samples. The tight clustering of the postoperative saliva samples within the heat map, which show almost no methylation in the 41 gene classifier, partially explains the bimodal distribution of preoperative saliva samples. Furthermore, the bimodal distribution of the preoperative saliva samples that resulted from hierarchical clustering in the current study implies two distinct groups of methylation patterns present within oral SCCs. This finding of molecular heterogeneity is consistent with our genomic results showing significant genetic difference among oral SCCs and clustering into two distinct groups (24). The heterogeneous nature of oral SCCs leads to sparseness of genomic and epigenetic data, even in studies with a large number of patients (8, 21), making the development of a biomarker prohibitively difficult. We used a two-part approach to deal with the sparseness of the data. First, we used a methylation array of 807 genes, which led to more comprehensive data. Second, patients served as their own controls. This approach significantly reduces the number of patients required. The comparison of the matched preoperative and postoperative samples is the most accurate method to identify cancer-specific gene methylation in saliva. When only normal subjects are used as the control, intersubject variability likely leads to selection of hypermethylated genes that might not be associated with the cancer. In almost all studies that have examined genetic or epigenetic changes in the saliva of head and neck cancer patients, the control group has consisted of normal subjects. To our knowledge, we are the only group to evaluate promoter methylation in both preoperative and postoperative saliva in oral SCC patients. Righini et al. (21) analyzed both preoperative and postoperative saliva samples, but 85 of 90 patients evaluated had verrucous carcinoma, which is biologically dissimilar to SCC. We used the postoperative sample to identify methylated genes that would be present in the saliva of oral cancer patients who are cancer free. Using postoperative saliva samples as we have done in the current study adds a level of stringency to the gene classifier selection process. Our goal in performing genome-wide methylation array analysis on both preoperative and postoperative saliva was to discover new genes that could be part of a reliable gene classifier. We propose that a gene classifier produced with this approach could ultimately be used in early diagnosis and prediction of outcome in patients with oral SCC.
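The selection logic behind using each patient as his or her own control can be sketched in Python as follows. The sketch assumes the β value cutoffs discussed above (>0.20 positive, up to 0.10 negative); the function names, the locus labels, and the `min_patients` parameter (how many patients must show the pattern for a locus to be retained) are hypothetical and are not taken from the study.

```python
POSITIVE_MIN = 0.20  # beta above this value: methylated
NEGATIVE_MAX = 0.10  # beta up to this value: unmethylated


def patient_shows_pattern(preop, tissue, postop):
    """True when a locus is methylated in preoperative saliva and tumor
    tissue but unmethylated in the same patient's postoperative saliva."""
    return preop > POSITIVE_MIN and tissue > POSITIVE_MIN and postop <= NEGATIVE_MAX


def select_loci(patient_betas, normal_betas, min_patients=1):
    """Return loci that show the cancer-specific pattern.

    patient_betas: locus -> list of (preop, tissue, postop) beta triplets
    normal_betas:  locus -> list of beta values from normal subjects
    min_patients:  hypothetical retention threshold (an assumption here)
    """
    selected = []
    for locus, triplets in patient_betas.items():
        n_patients = sum(patient_shows_pattern(*t) for t in triplets)
        normals_clean = all(b <= NEGATIVE_MAX for b in normal_betas.get(locus, []))
        if n_patients >= min_patients and normals_clean:
            selected.append(locus)
    return selected


# Toy beta values with hypothetical locus labels.
patients = {
    "LOCUS_1": [(0.60, 0.70, 0.05), (0.40, 0.50, 0.02), (0.05, 0.30, 0.04)],
    "LOCUS_2": [(0.60, 0.70, 0.50), (0.55, 0.65, 0.45)],  # persists after surgery
    "LOCUS_3": [(0.60, 0.70, 0.05), (0.50, 0.60, 0.03)],  # also seen in normals
}
normals = {
    "LOCUS_1": [0.03, 0.08],
    "LOCUS_2": [0.02, 0.04],
    "LOCUS_3": [0.30, 0.05],
}

print(select_loci(patients, normals, min_patients=2))
```

In this toy example, the postoperative requirement removes a locus that stays methylated after resection, and the normal-subject requirement removes a locus that is methylated without cancer, which mirrors the added stringency described above.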
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: NIDCR K12 DE14609 (Western Oral Research Consortium) and Oral and Maxillofacial Surgery Foundation.
Acknowledgments
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.