Abstract
Discovery of methylated DNA markers (MDM) of esophageal squamous cell carcinoma (ESCC) has sparked interest in assessing these markers in tissue. We evaluated MDMs in ESCC from three geographically and ethnically distinct populations, and explored the feasibility of assaying MDMs from DNA obtained by swallowed balloon devices.
MDMs were assayed in ESCC and normal tissues obtained from populations in the United States, Iran, and China, and in exfoliative cytology specimens obtained by balloons in a Chinese population. Areas under the receiver operating characteristic curve (AUC) of MDMs discriminating ESCC from normal tissues were calculated. Random forest prediction models were built, trained on U.S. cases and controls, and calibrated to U.S.-only controls (model 1) and three-country controls (model 2). Statistical tests were used to assess the relationship between dysplasia and MDM levels in balloons.
Extracted DNA from 333 ESCC and 322 normal tissues was analyzed, in addition to archival DNA from 98 balloons. For ESCC, model 1 validated in Iranian and Chinese tissues with AUCs of 0.90 and 0.87, and model 2 yielded AUCs of 0.99, 0.96, and 0.94 in tissues from the United States, Iran, and China, respectively. In Chinese balloons, MDMs showed a statistically significant trend of increasing levels with increasing grades of dysplasia (P < 0.004).
MDMs accurately discriminate ESCC from normal esophagus in tissues obtained from high- and low-incidence countries. Preliminary data suggest that levels of MDMs assayed in DNA from swallowed balloon devices increase with dysplasia grade. Larger studies are needed to validate these results.
MDMs coupled with minimally invasive collection methods have the potential for worldwide application in ESCC screening.
Introduction
Esophageal cancer was responsible for an estimated 509,000 deaths worldwide in 2018 (1). The majority of these cases occur in high-risk populations in Asia, Africa, and South America (2). Most esophageal cancers are diagnosed after the onset of symptoms signifying advanced stage (2–4). Hence, survival remains poor (3). In contrast, early esophageal cancer can be treated endoscopically or surgically with excellent survival approaching 90% at 5 years (4–6).
About 87% of esophageal cancer cases worldwide are esophageal squamous cell carcinoma (ESCC; ref. 2), which arises in the background of squamous dysplasia. Longitudinal studies have shown increasing risks of progression in those with mild, moderate, and severe dysplasia (7, 8). Successful endoscopic treatment of dysplasia has been demonstrated in recent studies (9–12), opening the possibility of preventing ESCC by finding and treating severe dysplasia in high-risk individuals.
Because squamous dysplasia and early ESCC are nearly always asymptomatic, they can currently be diagnosed only by endoscopy, which may reveal only subtle mucosal abnormalities. Although endoscopic screening has been demonstrated to reduce ESCC mortality (13), it requires considerable medical infrastructure and highly trained personnel and is therefore not suitable for widespread ESCC screening in limited-resource settings.
To overcome these limitations, various nonendoscopic devices (e.g., swallowed balloons, meshes, and sponges) have been developed to sample exfoliated cells from the entire esophageal mucosa. Most studies utilizing these devices were conducted in China, using cytology to diagnose dysplasia and cancer. Unfortunately, the sensitivity (≤45%) and specificity (≤82%) of cytology on these specimens were inadequate for screening (14, 15).
More recently, newer generations of nonendoscopic esophageal sampling devices have been combined with protein and methylated DNA markers (MDM) for the detection of Barrett esophagus and esophageal adenocarcinoma (16, 17). Methylation of specific CpG loci is strongly associated with cancer; these MDMs have demonstrated utility as biomarkers for early detection of cancer and precancer (18, 19). Potential application of this technology has led to renewed interest in nonendoscopic detection of esophageal squamous dysplasia and ESCC, further supported by the discovery of highly discriminant MDMs for ESCC in tumor tissue samples from U.S. patients (20, 21).
Compared with screening by endoscopy, a nonendoscopic esophageal tissue sampling device coupled with biomarkers has the potential to: (i) overcome the limitations of sampling error by sampling the entire esophageal epithelium, (ii) eliminate the factor of operator dependency related to endoscopy in detecting subtle lesions, (iii) improve the sensitivity and specificity of cytology alone from previous esophageal sampling studies, and (iv) increase patient access, by reducing cost and procedure-related risks. A screening tool combining recent improvements in nonendoscopic esophageal sampling devices with novel molecular markers for early ESCC, validated in populations with low and high prevalence of ESCC, could be transformative.
The epidemiology and risk factors of ESCC, however, differ significantly in geographically distinct populations. In Western countries such as the United States, where the incidence of ESCC is low, ESCC is 3–4 times more common in men, and the main risk factors for ESCC are smoking and alcohol consumption. However, in regions where the incidence of ESCC is high, such as Iran and China, ESCC is only 1.2–2 times more common in men, tobacco and alcohol consumption are less important, and the main risk factors appear to be polycyclic aromatic hydrocarbon exposure from nontobacco sources, drinking liquids at high temperatures, poor diets, and poor oral hygiene (22–24). Although novel MDMs for ESCC were discovered and validated in tissue cohorts from the United States, a low-incidence country, their application in countries with high incidence of ESCC, where the suspected etiology of ESCC may be different, has yet to be explored.
Therefore, in this study, we aimed: (i) to independently validate the ability of previously discovered MDMs to discriminate ESCC from normal esophageal tissue in the United States, where incidence of ESCC is low, (ii) to evaluate the performance of these MDMs in countries with high incidence of ESCC (Iran and China) to assess potential worldwide application, and (iii) in an exploratory pilot study, to determine the feasibility of assaying these MDMs and assessing their levels in archival DNA extracted from samples collected by swallowed balloon devices obtained from patients with varying grades of squamous dysplasia.
Materials and Methods
Preface
In a previous discovery effort in ESCC tissue from U.S. patients, we performed reduced representation bisulfite sequencing (RRBS) on DNA extracted from 10 frozen ESCCs and nine normal tissues to identify novel differentially methylated regions (DMR) of the genome, followed by technical validation using quantitative methylation-specific PCR (qMSP) on the same frozen samples. Technically validated DMRs were then biologically validated in independent formalin-fixed, paraffin-embedded (FFPE) tissue samples (35 ESCCs vs. 17 normal tissues). The nine DMRs showing highest discrimination by a priori criteria (TBX15, ARHGEF4, ZNF132, TSPYL5, ZNF781, GRIN2D, OPLAH, ZNF610, and MAX.chr19.40314899.40315491) were carried forward as candidate MDMs for this study (20). Three additional markers, EMX1, LOC645323, and VWC2, which were high-performing markers for other epithelial cancers of the gastrointestinal (GI) tract, were validated in a separate experiment on esophageal cancer tissues; from this experiment, LMX1A and SLC32A1 were also chosen after reanalysis of these data with normal colonic tissue excluded from the controls (21). Thus, a total of 14 candidate MDMs were selected for our study.
Patients and samples
This was a retrospective study in which all tissues were archival FFPE specimens containing ESCC or normal esophagus from the United States, Iran, or China. The flow of each aim of the study is shown in Fig. 1. The STARD checklist is provided in Supplementary Table S1.
Study flow diagram. Aim 1 study flow (A), aim 2 study flow (B), and aim 3 study flow (C). ng, nanogram.
The U.S. samples were from patients independent of those used in prior discovery studies, and came from the Mayo Clinic Tissue Registry, an archive of clinical tissue specimens maintained by the Mayo Clinic Department of Anatomic Pathology (Rochester, MN). The Iranian tissue samples were collected between 2003 and 2007 as part of the Golestan Case–Control Study in Golestan Province, Iran, a collaboration between the NCI (Bethesda, MD), the International Agency for Research on Cancer (IARC, Lyon, France), and the Digestive Diseases Research Institute of Tehran University of Medical Sciences (Tehran, Iran; ref. 23). The Chinese tissue samples were collected between 1997 and 2005, as part of the Upper GI Cancer Genetics Study in Shanxi Province, a collaboration between NCI (Bethesda, MD), Shanxi Cancer Hospital (Taiyuan, Shanxi, China), and Yangcheng Cancer Hospital (Yangcheng, Shanxi, China; ref. 25). The Chinese balloon samples came from the Cytology Sampling Study 2 collected in 2002, a substudy of the Early Detection of Esophageal Cancer research project, a collaboration between the NCI (Bethesda, MD) and the Cancer Institute of the Chinese Academy of Medical Sciences (Beijing, China; ref. 14). Balloon samples came from patients with worst histologic diagnoses of normal mucosa, lymphocytic esophagitis, mild squamous dysplasia (m-SDYS), moderate squamous dysplasia (M-SDYS), severe squamous dysplasia (S-SDYS), and ESCC (14).
Eligibility criteria for patients in Iran and China are included in source studies as referenced. In the U.S. cohort, patients were included if (i) they had a histologic diagnosis of ESCC (case) or normal esophagus and (ii) the diagnostic biopsy specimens were available for review. Patients were excluded if they had: (i) eosinophilic esophagitis, achalasia, Barrett esophagus, or esophageal adenocarcinoma, (ii) prior history of chemotherapy or radiation to the chest or mediastinum for treatment of any malignancy, or (iii) prior endoscopic or surgical treatment for esophageal cancer. All histologic diagnoses used to categorize tissue samples or balloon sample patients were confirmed by the study pathologists (T.C. Smyrk and S.M. Dawsey). All studies were approved by national institutional review boards (IRB) in the host countries and IRBs at the NCI (Bethesda, MD) and IARC (Lyon, France, where applicable). In the source studies from Iran and China, all endoscopic biopsies were obtained following Lugol chromoendoscopy (26). All laboratory and statistical analyses were conducted after approval by the Mayo Clinic IRB (Rochester, MN).
Laboratory methods and assays
FFPE tissues from the Iranian and Chinese studies were transferred to Mayo Clinic (Rochester, MN) from NCI (Bethesda, MD). Hematoxylin and eosin sections were reviewed by study pathologists to identify areas of ESCC for macrodissection. Up to three 1-mm cores per FFPE block, taken from the area marked by the study pathologist, were used for DNA extraction. DNA was extracted using the QIAamp FFPE Tissue Kit (Qiagen). DNA yield was measured fluorometrically on a plate reader with Quant-iT Pico-Green dsDNA Assay Reagents (Invitrogen). Tissue with DNA yield <50 ng was excluded from analyses and replaced with additional samples if available. Subsequently, samples were bisulfite converted using the Zymo EZ-96 DNA Methylation Kit (Zymo Research) and amplified with SYBR Green I detection using the LightCycler 480 (Roche Diagnostics).
MSP primers for candidate MDMs were designed with Methprimer software (University of California, San Francisco, CA) as described previously (27). Oligonucleotides were synthesized by Integrated DNA Technologies. Prior to use, qMSP assays were quality control tested on bisulfite-converted and -unconverted methylation (±) controls to verify amplification. Assay standards were dilutions of bisulfite-converted and M.SssI-treated genomic DNA, which ensured absolute quantification. Results were expressed as fractional methylation using the bisulfite-treated PCR product of ACTB (beta-actin) as a marker for total DNA in each sample, and analyzed logistically.
All balloon cytology samples were snap-frozen in liquid nitrogen at the field site in Linxian, China, and shipped frozen to NCI (Bethesda, MD), where the Gentra Puregene Cell Kit (Qiagen) was used to extract DNA from 300 μL of cell suspension. The DNA quality and quantity were checked using the 260:280 ratio, NanoDrop, and PicoGreen. The presence of human and bacterial DNA in the samples was verified by TaqMan assays using species-specific primers. Samples with DNA yield <50 ng were excluded from analysis. Similar to the methods above, DNA was then bisulfite converted and amplified, and candidate MDMs were assayed using qMSP, normalized to ACTB, and analyzed logistically.
Laboratory investigators were blinded to the clinical diagnoses of each sample.
Statistical analysis
Using a two-sided significance level of 5% and 80% power, a sample size of 99 per disease group per country was needed to detect a minimum area under the receiver operating characteristic curve (AUC) of 0.614 compared with the null AUC of 0.5.
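The source does not state which method was used for this sample size calculation. As an illustration only, the following sketch applies one common approximation, the Hanley and McNeil standard error for an AUC, to estimate the power available with 99 cases and 99 controls. The function names are hypothetical, and the choice of approximation is an assumption rather than the authors' documented procedure.

```python
import math

def hanley_mcneil_se(auc, n_cases, n_controls):
    """Standard error of an AUC under the Hanley-McNeil approximation."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_controls)
    return math.sqrt(variance)

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def auc_power(auc_alt, n_per_group, auc_null=0.5, z_alpha=1.959964):
    """Approximate power of a two-sided 5% test of auc_null versus auc_alt.

    Assumes equal group sizes and ignores the negligible lower-tail
    rejection probability.
    """
    se_null = hanley_mcneil_se(auc_null, n_per_group, n_per_group)
    se_alt = hanley_mcneil_se(auc_alt, n_per_group, n_per_group)
    z = ((auc_alt - auc_null) - z_alpha * se_null) / se_alt
    return normal_cdf(z)

# Power to detect AUC = 0.614 against 0.5 with 99 subjects per group.
power_at_99 = auc_power(0.614, 99)
print(f"approximate power: {power_at_99:.3f}")
```

Under these assumptions the estimated power is close to the 80% target, which is consistent with the stated sample size, although other methods may give slightly different numbers.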
Individual MDM levels were standardized by methylated ACTB (a surrogate measure for total methylated DNA) using a generalized additive model (GAM) to fit a nonlinear calibration curve for each MDM using normal controls.
This approach was chosen following the observed relationship between the targeted MDM and methylated ACTB to correct for different loadings of methylated DNA, bisulfite conversion rates, and capture efficiency (28). Because many of the relationships between the targeted MDM and methylated ACTB are nonlinear, we elected to use GAM, a well-established statistical method of fitting nonlinear relationships between two variables (29).
These estimated calibration curves were then applied to all samples. The distribution of the MDM levels for each disease group is displayed using box plots, with MDM levels represented by scaled copy numbers, which are the exponentiation of the GAM standardized MDM values divided by the marker's SD.
For aim 1, MDMs were independently validated in the U.S. cohort, and AUCs of individual MDMs were calculated.
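As a reminder of what a single-marker AUC measures, the sketch below computes it as the probability that a randomly chosen case has a higher marker level than a randomly chosen control, with ties counted as one half (the Mann-Whitney formulation). The marker values are invented for illustration and are not study data.

```python
def auc_mann_whitney(case_values, control_values):
    """AUC as the fraction of case-control pairs in which the case ranks higher."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_values) * len(control_values))

# Hypothetical standardized marker levels (not from the study).
cases = [2.1, 3.4, 1.9, 4.0]
controls = [0.5, 1.2, 2.0, 0.9]
example_auc = auc_mann_whitney(cases, controls)
print(example_auc)  # 15 of 16 pairs favor the case: 0.9375
```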
For aim 2, in addition to AUCs of individual MDMs, random forest classification, using default parameters averaging predictions across 500 bootstrap samples of the dataset, was used to build multivariate predictor models of ESCC using all candidate MDMs (30). Model 1 was designed to be independently validated in international cohorts, and therefore, was trained on only U.S. cases and controls. First, individual MDM levels for all U.S. subjects were standardized to methylated ACTB using the GAM estimated from U.S. normal controls as described above. Then, random forest was fit on these data. Leave-one-out cross-validation was used to estimate model performance in the U.S. cohort. The resulting random forest model (model 1) was then applied independently to the datasets of the other two countries (Iran and China) to validate this model's prediction of disease status (ESCC vs. normal) in these high-incidence populations.
Because of concerns regarding uncontrollable differences by country in potentially important aspects of the study (e.g., patient populations, sample collection, processing, and storage), a second random forest model (model 2) was also developed. For this model, GAM standardization to methylated ACTB levels was performed using the controls from all three countries. As defined above, a random forest model was then developed on the basis of the U.S. data, and applied to data from Iran and China. The performance of both random forest models was represented in ROC curves, with AUCs calculated for discrimination of ESCC from normal tissue.
Random forest classification was used to model the relationship between the panel of MDMs and disease status because it has been shown to provide superior generalizability and predictive accuracy in test datasets compared with logistic regression, with minimal concerns of overfitting (30, 31).
In the Chinese balloon samples, box plots were used to show the distribution of individual MDMs in patients with different histologic diagnoses. Because of the small numbers in each of the patient diagnoses, we combined patients into three clinically relevant categories: no dysplasia (combining those with normal esophagus or lymphocytic esophagitis), mild/moderate dysplasia, and severe dysplasia/ESCC (the actionable targets of endoscopic eradication therapy; ref. 32). A Jonckheere-Terpstra test for ordered differences was used to test whether MDM levels increase with higher levels of dysplasia, and a Bonferroni corrected P value cutoff was used to evaluate for significance in trend. The packages GAM and randomForest were used for GAM standardization and random forest modeling, respectively (33, 34). All analyses were performed using RStudio version 3.4.2 (35).
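The trend test described above can be sketched as follows. This is a minimal implementation of the Jonckheere-Terpstra statistic with a normal approximation and no tie correction in the variance; the one-sided alternative (levels increase with dysplasia grade) and the division of 0.05 by the 14 markers are assumptions, although 0.05/14 is about 0.0036, which is consistent with the reported cutoff of 0.004. The data are invented.

```python
import math

def jonckheere_terpstra(groups):
    """Jonckheere-Terpstra test for an increasing trend across ordered groups.

    groups: list of lists, ordered from lowest to highest expected level.
    Returns (J statistic, z score, one-sided P value for an increasing trend).
    Uses the normal approximation without a tie correction.
    """
    j_stat = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for low in groups[i]:
                for high in groups[j]:
                    if high > low:
                        j_stat += 1.0
                    elif high == low:
                        j_stat += 0.5
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    mean = (n * n - sum(s * s for s in sizes)) / 4.0
    variance = (
        n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)
    ) / 72.0
    z = (j_stat - mean) / math.sqrt(variance)
    p_increasing = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability
    return j_stat, z, p_increasing

# Assumed Bonferroni correction across the 14 candidate markers.
N_MARKERS = 14
bonferroni_cutoff = 0.05 / N_MARKERS

# Hypothetical marker levels for three ordered categories (not study data).
no_dysplasia = [1, 2, 3]
mild_moderate = [4, 5, 6]
severe_or_escc = [7, 8, 9]
j_stat, z_score, p_value = jonckheere_terpstra(
    [no_dysplasia, mild_moderate, severe_or_escc]
)
print(j_stat, round(z_score, 2), p_value < bonferroni_cutoff)
```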
Results
Esophageal mucosal tissue samples from 947 individuals from three countries (288 from the United States, 348 from Iran, and 279 from China) were analyzed. Of these, samples from 655 individuals (181 from the United States, 248 from Iran, and 226 from China) yielded adequate amounts of DNA, with 333 ESCC and 322 normal samples (Fig. 1; Table 1). All of the normal tissues came from patients with no endoscopic or histologic evidence of esophagitis, dysplasia, or ESCC. For the United States, Iran, and China, median ages and proportions of men in all samples combined were 65, 62, and 51 years and 55%, 47%, and 43%, respectively. Staging information was limited to the U.S. cases: 21% stage I, 29% stage II, 33% stage III, and 17% stage IV. It is likely that the majority of the Iranian and Chinese cancers were diagnosed at advanced stages because they were all symptomatic clinical cases. Among the U.S. cancer cases, 54% were smokers and 66% had a history of heavy alcohol use (36). Among the Iranian cases, 13% were smokers and only 1% ever drank alcohol (due to religious and cultural practice). These data were not available for the Chinese cases. Tissue DNA content was significantly lower in the U.S. samples than in the Iranian and Chinese samples (Supplementary Table S2).
Baseline characteristics.
Characteristic | U.S. tissue, ESCC (N = 86) | U.S. tissue, controls (N = 95) | U.S. tissue, P | Iran tissue, ESCC (N = 114) | Iran tissue, controls (N = 134) | Iran tissue, P | China tissue, ESCC (N = 133) | China tissue, controls (N = 93) | China tissue, P | China balloon, ESCC (N = 5) | China balloon, S-SDYS (N = 10) | China balloon, M-SDYS (N = 20) | China balloon, m-SDYS (N = 20) | China balloon, esophagitis (N = 18) | China balloon, normal (N = 20)
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Median age, years (IQR) | 68 (62–71) | 49 (43–60) | <0.001 (a) | 62 (55–70) | 60 (54–69) | 0.21 (a) | 51 (46–58) | 49 (45–54) | 0.02 (a) | 55 (50–61) | 59 (51–63) | 54 (48–64) | 56 (50–65) | 53 (51–56) | 54 (52–58)
Male sex, n (%) | 57 (66%) | 39 (41%) | 0.001 (b) | 48 (42%) | 63 (47%) | 0.51 (b) | 68 (51%) | 27 (29%) | 0.002 (b) | 3 (60%) | 7 (70%) | 10 (50%) | 6 (30%) | 4 (22%) | 7 (35%)
Abbreviations: IQR, interquartile range; N, number.
(a) Within countries, median ages were significantly different (P < 0.05) for ESCC versus normal controls in the United States and China, but not in Iran.
(b) Within countries, the percentage of males was significantly different (P < 0.05) between ESCC and normal tissue in the United States and China, but not in Iran.
Fourteen novel MDMs previously identified in tissue from U.S. subjects were assayed in this study (Table 2; refs. 20, 21); the names and functions of genes harboring these MDM sequences are summarized in Supplementary Table S3. For aim 1, individual MDMs performed well in independent U.S. samples with AUCs approaching 1 (Table 2). When applied to tissues from Iran and China, individual MDMs continued to perform well, discriminating ESCC from normal tissue (Table 2; three representative markers shown in Fig. 2). In the random forest classification analyses, model 1 (with all 14 MDMs assayed), normalized only to U.S. controls, had AUCs of 0.99 [95% confidence interval (CI), 0.98–1.00], 0.90 (95% CI, 0.87–0.94), and 0.87 (95% CI, 0.81–0.92) in U.S., Iranian, and Chinese tissues, respectively (Table 2). Model 2, normalized to controls from all three countries, yielded AUCs of 0.99 (95% CI, 0.98–1.00), 0.96 (95% CI, 0.93–0.98), and 0.94 (95% CI, 0.90–0.97), in U.S., Iranian, and Chinese tissues, respectively (Table 2). At 90% specificity, cross-validated sensitivities for detecting ESCC in the U.S. cohort were 97.6% and 98.8% for models 1 and 2, respectively. Figure 3 shows ROC curves for the two models for each country.
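Sensitivity at a fixed specificity, as reported above, can be illustrated with a short sketch. The thresholding rule used here (the smallest cutoff at which at least 90% of controls score at or below it) is an assumption, as are the function name and the scores, which stand in for cross-validated model predictions.

```python
import math

def sensitivity_at_specificity(case_scores, control_scores, specificity=0.90):
    """Sensitivity at the cutoff that classifies the required share of controls as negative.

    A sample is called positive when its score is strictly above the cutoff.
    Returns (sensitivity, cutoff).
    """
    ranked = sorted(control_scores)
    # Index of the smallest control score with at least `specificity` of controls at or below it.
    k = math.ceil(round(specificity * len(ranked), 9)) - 1
    cutoff = ranked[k]
    positives = sum(1 for score in case_scores if score > cutoff)
    return positives / len(case_scores), cutoff

# Hypothetical predicted probabilities of ESCC (not study data).
control_scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.25, 0.30, 0.60]
case_scores = [0.25, 0.45, 0.70, 0.80, 0.95]
sensitivity, cutoff = sensitivity_at_specificity(case_scores, control_scores)
print(sensitivity, cutoff)  # 4 of 5 cases exceed the 0.30 cutoff
```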
Discrimination of ESCC from normal tissue by individual MDMs, and by models 1 and 2; AUCs with 95% CI are shown.
MDM or model | U.S. tissue samples (ESCC vs. normal), AUC (95% CI) | Iranian tissue samples (ESCC vs. normal), AUC (95% CI) | Chinese tissue samples (ESCC vs. normal), AUC (95% CI)
---|---|---|---
TBX15 | 0.99 (0.98–1.00) | 0.93 (0.89–0.96) | 0.93 (0.89–0.97)
SLC32A1 | 0.98 (0.95–1.00) | 0.90 (0.86–0.94) | 0.82 (0.76–0.87)
LMX1A | 0.97 (0.94–0.99) | 0.92 (0.89–0.96) | 0.90 (0.85–0.94)
ARHGEF4 | 0.97 (0.95–0.99) | 0.78 (0.72–0.84) | 0.86 (0.81–0.91)
ZNF132 | 0.94 (0.90–0.98) | 0.97 (0.95–0.99) | 0.90 (0.85–0.94)
TSPYL5 | 0.94 (0.90–0.98) | 0.92 (0.88–0.96) | 0.85 (0.80–0.90)
ZNF781 | 0.94 (0.90–0.97) | 0.93 (0.90–0.97) | 0.83 (0.78–0.89)
GRIN2D | 0.93 (0.89–0.97) | 0.93 (0.89–0.96) | 0.93 (0.90–0.97)
OPLAH | 0.88 (0.83–0.94) | 0.95 (0.92–0.97) | 0.88 (0.83–0.93)
LOC645323 | 0.88 (0.82–0.94) | 0.90 (0.86–0.95) | 0.88 (0.83–0.93)
ZNF610 | 0.88 (0.83–0.94) | 0.91 (0.86–0.95) | 0.81 (0.75–0.87)
EMX1 | 0.87 (0.81–0.93) | 0.95 (0.92–0.98) | 0.81 (0.76–0.87)
MAX.chr19.40314899.40315491 | 0.74 (0.66–0.81) | 0.90 (0.86–0.94) | 0.72 (0.66–0.79)
VWC2 | 0.68 (0.60–0.76) | 0.59 (0.51–0.66) | 0.69 (0.63–0.76)
Model 1 | 0.99 (0.98–1.00) | 0.90 (0.87–0.94) | 0.87 (0.81–0.92)
Model 2 | 0.99 (0.98–1.00) | 0.96 (0.93–0.98) | 0.94 (0.90–0.97)
Note: Model 1 was a random forest model trained on the U.S. cohort with GAM normalization to U.S. controls only; model 2 was a random forest model trained on the U.S. cohort, with GAM normalization to controls from all three countries. Both models were applied to ESCC and normal tissues from all three countries.
Box plot distributions of representative markers GRIN2D, TBX15, and ZNF132 (A–C, respectively). The x-axis labels the normal and ESCC bars of the three countries, and the y-axis shows the MDM levels (scaled copy numbers of each marker, standardized by ACTB). AUCs of individual markers and 95% CIs are also shown. The “scaled copy number” is the exponentiation of the GAM standardized MDM values divided by the marker's SD.
Overall discrimination of ESCC versus normal tissue by MDM models assayed from tissue from three countries (United States, A; Iran, B; and China, C). AUCs and 95% CIs for the random forest model 1 and model 2 are shown. Model 1 was a random forest model trained on the U.S. cohort with GAM normalization to U.S. controls only, and model 2 was a random forest model trained on the U.S. cohort, with GAM normalization to controls from all three countries. Both models were applied to ESCC and normal tissues from all three countries.
Because of the observed differences in baseline demographic characteristics among the three countries, AUCs were calculated after stratification by age and sex in each country (Supplementary Table S4). We found that these variables generally had no significant impact on model performance, with two exceptions: age stratification led to a significant difference in the performance of model 1 in younger (AUC, 0.95) versus older (AUC, 0.86) Iranians, and sex stratification led to a significant difference in the performance of model 1 in female (AUC, 0.92) versus male (AUC, 0.78) Chinese participants. To evaluate model 2 for overfitting, an alternative approach, which first split the U.S. samples (95 controls and 86 cases) into training and test sets in a 2:1 ratio, was also performed. This approach used the same methods for standardization, but used only those controls who were in the training set (2/3 of the total controls) for the GAM estimation. A random forest model was then developed in the training set and evaluated in the test set. Bootstrap resampling (1,000 samples) was used to help better understand the large-scale performance of this alternative approach. The result (AUC, 0.99; 95% CI, 0.98–1.00) was essentially the same as the result using model 2.
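The split-sample check described above can be sketched in outline. The source does not specify how the 1,000 bootstrap samples were drawn, so this sketch assumes a percentile bootstrap over test-set scores; the 2:1 split helper, the function names, and the scores are all hypothetical, and the model-fitting step is omitted.

```python
import random

def auc(case_scores, control_scores):
    """AUC as the fraction of case-control pairs in which the case scores higher."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            wins += 1.0 if case > control else 0.5 if case == control else 0.0
    return wins / (len(case_scores) * len(control_scores))

def split_two_to_one(items, rng):
    """Randomly split items into a training set (about 2/3) and a test set (about 1/3)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]

def bootstrap_auc_interval(case_scores, control_scores, n_boot=1000, seed=0):
    """Percentile 95% interval for the AUC, resampling cases and controls separately."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        boot_cases = [rng.choice(case_scores) for _ in case_scores]
        boot_controls = [rng.choice(control_scores) for _ in control_scores]
        estimates.append(auc(boot_cases, boot_controls))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]

# Splitting 95 control identifiers 2:1, mirroring the U.S. control count.
train_controls, test_controls = split_two_to_one(range(95), random.Random(1))

# Hypothetical test-set model scores (not study data).
test_case_scores = [0.90, 0.80, 0.75, 0.60, 0.95, 0.40, 0.85, 0.70]
test_control_scores = [0.10, 0.20, 0.30, 0.50, 0.65, 0.15, 0.25, 0.05]
lower, upper = bootstrap_auc_interval(test_case_scores, test_control_scores)
print(len(train_controls), len(test_controls), lower, upper)
```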
For the exploratory balloon cytology aim, DNA from 93 Chinese balloon cytology samples that passed quality control was analyzed, including 20 samples from patients with normal mucosa, 18 with lymphocytic esophagitis (a prevalent condition in the region), 20 with m-SDYS, 20 with M-SDYS, 10 with S-SDYS, and five with ESCC (Table 1). Levels of three representative MDMs (GRIN2D, TBX15, and ZNF132) in patients with no dysplasia, m-SDYS + M-SDYS, and S-SDYS + ESCC are shown in Fig. 4. The Jonckheere-Terpstra test for ordered differences, with a Bonferroni-corrected P value cutoff, provided statistical evidence that increasing grade of dysplasia was associated with significantly increased marker levels for many of the MDMs assayed (Supplementary Table S5).
Box plot distributions of representative individual markers GRIN2D, TBX15, and ZNF132 (A–C, respectively) in DNA extracted from balloon cytology samples. Cases were categorized by the patient's worst histologic diagnosis: no dysplasia, mild/moderate dysplasia, and severe dysplasia/ESCC. The Jonckheere-Terpstra (J-T) test with a Bonferroni-corrected P value cutoff (0.004) was used to assess the significance of the trend between the severity of dysplasia and MDM level. The P values shown are 0.004, <0.001, and <0.001 for GRIN2D, TBX15, and ZNF132, respectively. The “scaled copy number” is the exponentiation of the GAM standardized MDM values divided by the marker's SD.
Discussion
Current endoscopic strategies to screen for ESCC are not realistic for population-level application in resource-poor settings because of cost and invasiveness, and are limited by operator dependency and sampling error. A nonendoscopic sampling device coupled with sensitive and specific molecular markers has the potential to overcome these limitations.
In this study, novel MDMs for ESCC discovered in a U.S. population were tested in countries with low and high incidence of ESCC: the United States, Iran, and China. Individual MDMs discriminated ESCC from normal esophagus, and two random forest models were highly discriminant for ESCC in samples from all three countries. In addition, in a pilot study, we showed the feasibility of assaying MDMs in archival DNA from swallowed balloon devices. We also observed a trend of increasing levels of MDMs with increasing severity of dysplasia. Although the sample size of this pilot study was too small to definitively evaluate differences in MDM levels of patients with individual grades of dysplasia, these results suggest that it should be possible to apply MDMs to cytology samples obtained from nonendoscopic esophageal sampling devices, and that MDM results may be potentially useful for nonendoscopic ESCC screening in high-risk populations.
MDMs were chosen in this study as they occur at a predictable site in each gene, and are readily assayed without subjective interpretation. Most of the top MDMs identified in this study have not been previously reported for ESCC detection. The biological function and differential expression of the associated genes are shown in Supplementary Table S3. Notably, most of these genes were involved in either transcriptional regulation or signal transduction, and differential expression was observed in normal esophagus versus ESCC. These findings should be explored further in biological studies to determine their underlying mechanisms and their potential as therapeutic targets.
Agnostic discovery of MDMs for ESCC is novel. We recently reported the discovery of highly discriminant MDMs for ESCC in 10 cases and nine controls using RRBS; these MDMs were validated in independent tissues from 35 ESCCs and 17 controls (20, 21). A few other studies have sought to evaluate the methylome signature of ESCC using a microarray approach (37, 38). Studies validating the accuracy of these markers are lacking, and there are no previously published methylation data on Iranian ESCCs (39).
In this study, individual MDMs were highly discriminant for ESCC across three countries with varying incidence of ESCC, with TBX15 yielding AUCs of 0.99, 0.93, and 0.93 in the U.S., Iranian, and Chinese ESCC tissues, respectively. A cross-validated model developed from the U.S. cohort (model 1) yielded moderately high AUCs of 0.90 and 0.87, respectively, when applied to Iranian and Chinese ESCC tissues. Model 2, normalized to controls from all three sites, was developed to account for potential differences in baseline MDM levels, either due to variations in genetics and environmental exposures, or due to technical variations in tissue processing (Supplementary Table S2). This model performed better than model 1, with AUCs of 0.96 and 0.94 in Iranian and Chinese tissues, respectively. These results indicate that despite geographic differences and ESCC incidence variation, previously discovered MDMs have the potential to be universally applied.
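As a reminder of what the AUCs quoted above measure, the AUC of a single marker equals the probability that a randomly chosen ESCC sample has a higher marker level than a randomly chosen normal sample, with ties counted as one half. The following is a minimal sketch of that definition only, not the analysis code used in this study; the function name and the example values are hypothetical.

```python
def marker_auc(case_levels, control_levels):
    """AUC as the fraction of (case, control) pairs in which the case
    has the higher marker level; tied pairs count as one half."""
    if not case_levels or not control_levels:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for case in case_levels:
        for control in control_levels:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_levels) * len(control_levels))


# Hypothetical marker levels, for illustration only.
example_cases = [3.0, 4.0, 5.0]
example_controls = [1.0, 2.0, 3.0]
example_auc = marker_auc(example_cases, example_controls)
```

An AUC of 1.0 means every case exceeds every control, and 0.5 means the marker carries no discriminating information.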
With a RR of up to 28.3 for ESCC, squamous dysplasia is an ideal target for screening and endoscopic therapy (8). Ours is the first study to explore the feasibility of assaying MDMs in exfoliative cytology specimens from patients with different histologic degrees of dysplasia. In our limited pilot study, we confirmed the feasibility of assaying these markers in DNA from these cell specimens. For many MDMs, we observed a statistically significant trend of increasing marker levels with increasing grades of dysplasia. These results suggest that evaluating MDMs in cytology samples obtained by nonendoscopic esophageal sampling devices may be a potential methodology for detecting precursor dysplasias and early-stage ESCC, a prospect that should be evaluated in larger future studies.
This study had some limitations. Although this is the largest study to date of tissue biomarkers of ESCC in populations with high and low incidence of ESCC, we relied on convenience samples with limited clinical information. In particular, demographic and staging information was not available for all patients due to the retrospective nature of this study. We observed significant differences in baseline demographics among patients from the three countries, and age- and sex-stratified AUCs for each country showed a significant impact on two comparisons: age stratification led to a significantly better performance of model 1 in younger versus older Iranians, and sex stratification led to a significantly better performance of model 1 in female versus male Chinese participants (Supplementary Table S4).
Another potential limitation was that the baseline collection, fixation, storage, and overall quality of the samples were not uniform (Supplementary Table S2), although all experiments were performed in a blinded fashion to minimize bias and all assays were run together to avoid batch effects. This nonuniformity could have led to variable results and negatively influenced the performance of model 1. Future prospective studies with standardized methods of sample collection and processing are needed to overcome the limitations of such convenience samples.
Unlike model 1, which was independently developed in one country and then tested in all countries, model 2 was developed with GAM normalization to controls from all three countries to account for the baseline differences above. This could have led to overfitting and inflated performance of this second model. To evaluate this possibility, we split the data into training and test sets as described above. Bootstrap resampling yielded results identical to those of model 2, suggesting that its performance was unlikely to have been significantly affected by overfitting.
Finally, in our analyses of esophageal exfoliative specimens obtained by swallowed balloon devices, the sample size was too small to definitively assess measures of diagnostic accuracy. In this pilot study, our aim was only to explore the feasibility of assaying MDMs from archival DNA collected using nonendoscopic esophageal balloons and to assess trends in MDM levels across grades of dysplasia, not to evaluate the performance of MDMs in terms of sensitivity and specificity. Hence, the Jonckheere-Terpstra test, a nonparametric test of trend between an ordinal independent variable (dysplasia grade) and a continuous dependent variable (MDM level), was used and demonstrated significant increasing trends for many of the MDMs (Supplementary Table S5). Future prospective field studies will be needed to answer the question of performance.
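For readers unfamiliar with the Jonckheere-Terpstra test, the sketch below shows how its statistic is formed from groups ordered by dysplasia grade. It is an illustration under stated assumptions, not the software used in this study: it uses the large-sample normal approximation, applies no tie correction to the variance, and reports a one-sided P value for an increasing trend. The function name and the example values are hypothetical.

```python
import math
from itertools import combinations


def jonckheere_terpstra(ordered_groups):
    """Jonckheere-Terpstra trend test (normal approximation, no tie correction).

    ordered_groups: list of lists of marker levels, one list per ordinal
    category, arranged from lowest to highest category.
    Returns (J, z, one-sided p value for an increasing trend).
    """
    # J counts, over every pair of groups (lower, higher), the number of
    # value pairs in which the higher-category value is larger; ties add 0.5.
    j_stat = 0.0
    for lower, higher in combinations(ordered_groups, 2):
        for x in lower:
            for y in higher:
                if y > x:
                    j_stat += 1.0
                elif y == x:
                    j_stat += 0.5

    sizes = [len(group) for group in ordered_groups]
    n = sum(sizes)
    # Null mean and variance of J when there are no ties.
    mean = (n * n - sum(s * s for s in sizes)) / 4.0
    var = (n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)) / 72.0
    z = (j_stat - mean) / math.sqrt(var)
    # Upper-tail probability of the standard normal distribution.
    p_increasing = 0.5 * math.erfc(z / math.sqrt(2.0))
    return j_stat, z, p_increasing


# Hypothetical marker levels for three increasing grades, illustration only.
example_groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
example_j, example_z, example_p = jonckheere_terpstra(example_groups)
```

With small groups or many tied values, an exact or permutation-based P value would be preferable to this approximation.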
In summary, we have shown that MDMs discovered in U.S. tissues discriminate ESCC from normal esophagus in tissue from geographically distinct populations with varying incidences of ESCC. A pilot study with a limited number of nonendoscopically collected esophageal exfoliative cytology specimens from China also demonstrated the feasibility of assaying MDMs from archival DNA, as well as a trend of increasing MDM levels with increasing grades of dysplasia. These results suggest that MDM assay of esophageal exfoliative cytology specimens may be an effective method for large-scale screening of at-risk populations. Mechanistic studies could further explore the biological role of the genes that harbor these MDMs. Larger prospective studies with uniform standardized protocols are needed to validate the utility of these MDMs in population-based screening programs.
Disclosure of Potential Conflicts of Interest
W. Taylor reports other from Exact Sciences (laboratory funding) during the conduct of the study, other from Exact Sciences (laboratory funding) outside the submitted work, and a patent for detecting esophageal disorders, 10435755, issued to Exact Sciences. X. Cao reports a patent for Exact Sciences licensed and with royalties paid from Mayo Agr.6244 (technology case N:2009-200). D.W. Mahoney reports a patent for 10435755 issued to Exact Sciences, is listed as an inventor on joint intellectual property of Mayo Clinic and Exact Sciences, and may receive royalties in accordance with Mayo Clinic policy. D. Ahlquist reports grants from Exact Sciences (Mayo Clinic and Exact Sciences have a formal collaborative agreement to develop novel cancer detection methods; some funding came from this source) during the conduct of the study; other from Exact Sciences (coinventor on multiple patents licensed to Exact Sciences from Mayo Clinic, and as per Mayo Clinic policy, any potential future royalties would be shared with inventors) outside the submitted work; a patent for detecting esophageal disorders, 10435755, issued and licensed to Exact Sciences (licensed to Exact Sciences by Mayo Clinic); and as part of a formal agreement between Mayo Clinic and Exact Sciences, has served as a scientific advisor to Exact Sciences. Exact Sciences played no role in the concept, design, conduct, or data interpretation of this study. J.B. Kisiel reports grants and other from Exact Sciences (royalties) during the conduct of the study; grants and other from Exact Sciences (royalties) outside the submitted work; as well as a patent for 10435755, detecting esophageal disorders, issued and licensed to Exact Sciences. P.G. Iyer reports grants from Exact Sciences (research funding) during the conduct of the study, as well as grants from Pentax Medical, grants and other from Medtronic (consulting), and other from CSA Medical (consulting) and Symple Surgical (consulting) outside the submitted work. 
No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Y. Qin: Conceptualization, data curation, formal analysis, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. W. Taylor: Data curation. W.R. Bamlet: Data curation. A. Ravindran: Data curation. A. Buglioni: Data curation. X. Cao: Data curation. P.H. Foote: Data curation. S.W. Slettedahl: Data curation. D.W. Mahoney: Data curation. P.S. Albert: Data curation. S. Kim: Data curation. N. Hu: Data curation. P.R. Taylor: Data curation. A. Etemadi: Data curation. M. Sotoudeh: Data curation. R. Malekzadeh: Data curation. C.C. Abnet: Data curation. T.C. Smyrk: Data curation. D. Katzka: Data curation. M.D. Topazian: Data curation. S.M. Dawsey: Data curation. D. Ahlquist: Data curation. J.B. Kisiel: Data curation. P.G. Iyer: Writing–review and editing.
Acknowledgments
This work was partially supported by the Maxine and Jack Zarrow Family Foundation of Tulsa, Oklahoma (to J.B. Kisiel), R37CA214679 (to J.B. Kisiel), and R01CA241164 (to P.G. Iyer and J.B. Kisiel). This work was also funded, in part, by the Intramural Research Program of the NCI. Sequencing costs and MSP assays were provided by Exact Sciences.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.