Abstract
Purpose: The current tumor–node–metastasis (TNM) staging system cannot provide adequate information for predicting prognosis or chemotherapeutic benefit. We constructed a classifier to predict prognosis and to identify a subset of patients who can benefit from adjuvant chemotherapy.
Experimental Design: We detected expression of 15 immunohistochemistry (IHC) features in tumors from 251 gastric cancer (GC) patients and evaluated the association of their expression levels with overall survival (OS) and disease-free survival (DFS). Then, integrating multiple clinicopathologic features and IHC features, we used support vector machine (SVM)–based methods to develop a prognostic classifier (GC-SVM classifier). The GC-SVM classifier was further validated in two validation cohorts totaling 535 patients.
Results: The GC-SVM classifier integrated patient sex, carcinoembryonic antigen, lymph node metastasis, and the protein expression level of eight features, including CD3 at the invasive margin (CD3IM), CD3 at the center of tumor (CD3CT), CD8IM, CD45ROCT, CD57IM, CD66bIM, CD68CT, and CD34. Significant differences were found between the high- and low-GC-SVM patients in 5-year OS and DFS in training and validation cohorts. Multivariate analysis revealed that the GC-SVM classifier was an independent prognostic factor. The classifier had higher predictive accuracy for OS and DFS than TNM stage and can complement the prognostic value of the TNM staging system. Further analysis revealed that stage II and III GC patients with high-GC-SVM were likely to benefit from adjuvant chemotherapy.
Conclusions: The newly developed GC-SVM classifier was a powerful predictor of OS and DFS. Moreover, the GC-SVM classifier could predict which patients with stage II and III GC benefit from adjuvant chemotherapy. Clin Cancer Res; 24(22); 5574–84. ©2018 AACR.
This article is featured in Highlights of This Issue, p. 5491
The tumor–node–metastasis (TNM) staging system for gastric cancer (GC) is not adequate for defining prognosis and cannot identify the candidates who are likely to benefit from chemotherapy. In this research, we constructed an SVM-based GC prognostic classifier (GC-SVM) integrating 3 clinicopathologic features and 8 immunohistochemistry features in the training cohort of 251 patients. The GC-SVM classifier was further validated in 2 validation cohorts totaling 535 patients. Multivariate analysis revealed that the GC-SVM classifier was an independent prognostic factor. Furthermore, the classifier had higher predictive accuracy for overall survival and disease-free survival than TNM stage and can add prognostic value to the TNM staging system. Moreover, the GC-SVM classifier might be able to predict which patients will benefit from adjuvant chemotherapy. Thus, the classifier could facilitate patient counseling and individualized management.
Introduction
Gastric cancer (GC) is one of the most common malignancies and the second leading cause of cancer-related deaths worldwide (1). Surgical resection is the main curative method for GC, but a high rate of relapse in patients with advanced GC makes it important to consider adjuvant treatments (2, 3). Currently, the tumor–node–metastasis (TNM) staging system and histologic classification are used for routine prognostication and treatment among patients with GC, but neither provides substantial predictive value (2–4). Given that GC patients with the same clinical stage and similar treatment regimens often undergo substantially different clinical courses, a new GC classification system is needed for more precise prediction of prognosis, thus enabling a more tailored therapeutic approach with improved outcomes for GC patients.
Extensive studies have suggested that tumor-infiltrating immune cells and tumor angiogenesis are correlated with prognosis in cancer (5–10). Galon and colleagues showed that the type, density, and location of immune cells in colorectal cancers had a prognostic value that was superior to and independent of that of the TNM stage (11–13). Based on the enumeration of lymphocyte and/or myeloid cell populations in the center of tumor (CT) and the invasive margin (IM), an immune score could predict survival and/or treatment response (5–7, 11, 13–18). An immunoscore for colon cancer, derived from a measure of CD3-positive and CD8-positive cell densities in the CT and IM, had a larger relative prognostic value than pT stage, pN stage, lympho-vascular invasion, tumor differentiation, and microsatellite instability (MSI) status (16). Two immunoscores for GC, derived from 5 features (CD3 IM, CD3 CT, CD8 IM, CD45RO CT, and CD66b IM) and from 11 types of immune cell fraction, respectively, also showed strong prognostic value (17, 18). Our previous studies also revealed that tumor-infiltrating lymphocytes, myeloid cells, and angiogenesis play critical roles in GC progression, and that combining multiple immune biomarkers would substantially improve the prognostic value (10, 17, 19). Furthermore, a survival prediction model based on specific tumor and patient clinicopathologic characteristics could be used to predict survival benefit from adjuvant chemotherapy for patients with stage II or stage III GC (20). On the basis of these findings, we hypothesized that combining multiple clinicopathologic features with immunomarkers of immune cells and angiogenesis in the tumor can improve overall prediction of GC outcome.
Recently, several supervised learning methods, such as decision trees, have been applied to the analysis of cDNA or tissue microarrays to refine prognosis in breast cancer, nasopharyngeal carcinoma, and non–small cell lung cancer (21). State-of-the-art classification algorithms such as support vector machines (SVM) can be used to select a small subset of highly discriminating markers and patients or disease attributes to build reliable cancer classifiers (22, 23).
Therefore, in this study, we integrated multiple clinicopathologic features and immunomarkers to develop an SVM-based GC prognostic classifier (GC-SVM) that predicts overall survival (OS) and disease-free survival (DFS), and we explored whether the GC-SVM classifier could identify the patients with stage II and III GC who might benefit more from postoperative adjuvant chemotherapy.
Materials and Methods
Patients and tissue samples
The study enrolled three independent cohorts of patients with GC. The training cohort (251 consecutive patients) and the internal validation cohort (248 consecutive patients), all of whom underwent total or partial radical gastrectomy, were obtained from Nanfang Hospital of Southern Medical University (Guangzhou, China) between January 2005 and August 2007 and between September 2007 and August 2009, respectively. The external validation cohort, comprising 287 consecutive patients, was obtained from the First Affiliated Hospital of Sun Yat-sen University (SYSU) between January 2005 and December 2007 using the same enrollment criteria. Use of human tissues with informed consent from patients was approved by the Clinical Research Ethics Committee of each hospital. Clinical baseline data were retrospectively collected for each patient. The clinical sources of the 786 patients with GC are listed in Table 1. Inclusion criteria were availability of hematoxylin and eosin slides with invasive tumor components, availability of follow-up data and clinicopathologic characteristics, no history of cancer treatment, and appropriate patient-informed consent. We excluded patients if formalin-fixed paraffin-embedded (FFPE) tumor (CT and IM) and normal samples from the initial diagnosis were unavailable or if they had received previous treatment with any anticancer therapy. Two independent pathologists reassessed all these samples. This study was conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all patients, and this study was approved by the Review Boards at Nanfang Hospital of Southern Medical University and the First Affiliated Hospital of SYSU. Data were analyzed from July 21, 2017, to December 2, 2017.
Baseline information for each patient with GC, including age, gender, American Society of Anesthesiologists score, Eastern Cooperative Oncology Group performance status, Charlson comorbidity index, tumor location, tumor size, differentiation, Lauren type (17), carcinoembryonic antigen (CEA), cancer antigen 19-9 (CA19-9), TNM staging at surgery, postsurgical chemotherapy, and follow-up data (follow-up duration and survival), was documented. The TNM staging was reclassified according to the seventh edition of the Cancer Staging Manual of the American Joint Committee on Cancer (AJCC)/International Union Against Cancer (24). The tumor size was defined according to the longest diameter of the sample. Patients with advanced-stage disease, or with early-stage tumors with extensive lymph node metastasis, were candidates for postsurgical chemotherapy. Collectively, 106 (42.2%), 157 (63.3%), and 138 (48.1%) patients received 5-fluorouracil–based postsurgical chemotherapy in the training, internal validation, and external validation cohorts, respectively. Of the 401 patients treated with postoperative chemotherapy, 126 (31.4%) patients received the XELOX (capecitabine–oxaliplatin) regimen, 271 (67.6%) patients received the FOLFOX (fluorouracil–folinic acid–oxaliplatin) regimen, and only 4 (1.0%) patients received 5-FU treatment alone (Supplementary Table S1). Follow-up data were collected from hospital records for patients who were lost to follow-up. The follow-up duration was measured from the time of surgery to the last follow-up date, and information regarding the survival status at the last follow-up was collected.
Immunohistochemistry and image analysis
On the basis of previous study findings, we selected eight molecular markers involved in different aspects of GC development and metastasis, including seven immune cell biomarkers [CD3 (pan T cells), CD8 (cytotoxic T cells), CD45RO (memory T cells), CD45RA (naïve T cells), CD57 (natural killer cells), CD68 (macrophages), CD66b (neutrophils)] and a microvascular marker (CD34). FFPE samples were cut into 4-μm sections, which were then processed for immunohistochemistry (IHC) as previously described (10, 17). Detailed information is provided in the Supplementary Materials. Every staining run contained a slide treated with PBS buffer in place of the primary antibody as a negative control (25, 26). Slide sections of lymph nodes were employed as positive controls for immune cell staining. Slide sections of GC tumor tissues previously verified to overexpress CD34 were chosen as positive controls for CD34 staining. Every staining run contained a positive control slide. Prior to staining, the sections were subjected to endogenous peroxidase blocking in 1% H2O2 solution diluted in methanol for 10 minutes and then heated in a microwave for 30 minutes with 10 mmol/L citrate buffer (pH 6.0). Serum blocking was performed using 10% normal rabbit serum for 30 minutes. For each antibody, all slides were stained with the same concentration of monoclonal primary antibody and incubated overnight at 4°C. The reaction was visualized using diaminobenzidine (DAB) + chromogen, and nuclei were counterstained using hematoxylin. For each antibody, all slides were developed with DAB for the same duration (Supplementary Table S2).
The IHC results were evaluated by two independent, experienced gastroenterology pathologists who were blinded to the clinical characteristics and prognosis. At low power (×100), the tissue sections were screened using an inverted research microscope (model DM IRB; Leica, Germany), and the five most representative fields were selected. Thereafter, to evaluate the density of stained immune cells, the two respective areas of CT and IM were measured at ×200 magnification. The nucleated stained cells in each area were quantified and expressed as the number of cells per field. For immune cells and microvessels, the pathologists evaluated the staining intensity and determined whether each immune cell or microvessel was positive (10, 17, 19, 27, 28). The microvessel density (MVD) in GC tumor tissues was evaluated by staining for CD34. Any discrete cluster or single cell stained for CD34 was counted as one microvessel (10, 27). Five representative fields were quantified, and the average number of microvessels per field (×200) was presented as the MVD. The two pathologists' results were in complete agreement in 90% of the cases. A third pathologist was consulted when the two primary pathologists disagreed. If the third pathologist agreed with one of them, then that value was selected. If the third pathologist's conclusion differed from both, then the three of them worked collaboratively to reach a common answer. We selected the optimum cutoff score for every feature using X-tile software version 3.6.1 (Yale University School of Medicine, New Haven, CT) based on the association with the patients' OS.
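The scoring workflow described above (average over representative fields, third-reader adjudication, then dichotomization at a cutoff) can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the counts are made up, and treating values strictly above the cutoff as "high" is an assumption, because the text does not state how ties at the cutoff were handled.

```python
def field_average(counts):
    """Mean count per field, e.g. MVD from five x200 fields."""
    if not counts:
        raise ValueError("at least one scored field is required")
    return sum(counts) / len(counts)


def adjudicate(score_a, score_b, score_c=None):
    """Resolve two primary readings, optionally with a third reader.

    Returns (value, needs_consensus). When the readers cannot be
    reconciled by rule, value is None and a joint review is needed.
    """
    if score_a == score_b:
        return score_a, False
    if score_c is not None and score_c in (score_a, score_b):
        return score_c, False  # third reader sides with one primary reader
    return None, True


def dichotomize(value, cutoff):
    """Assumption: values strictly above the cutoff are labeled 'high'."""
    return "high" if value > cutoff else "low"


if __name__ == "__main__":
    mvd = field_average([12, 15, 9, 14, 10])  # illustrative counts
    print(mvd, dichotomize(mvd, cutoff=10))
    print(adjudicate(3, 5, 5))
```

The cutoff itself would come from X-tile in the study; here it is simply passed in as a parameter.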
Prognosis prediction using SVM-based methods
SVM was introduced by Vapnik (29) for data classification and function approximation. An SVM is a binary classifier trained on a set of labeled patterns called training samples. The purpose of training an SVM is to find a hyperplane that divides these samples into two sides so that all the points with the same label lie on the same side of the hyperplane (22, 29–32). In this study, SVM was used to predict whether a patient died within 5 years. We adopted the SVM-recursive feature elimination algorithm to select and rank useful features (22). To investigate the possibility of identifying different prognostic subsets of patients based on their clinicopathologic features and immunomarkers using SVM, we performed a set of experiments in the training cohort of 251 patients; the developed GC-SVM classifier was further validated in 535 patients from 2 independent cohorts. In the training cohort, patients on the side of the hyperplane associated with better survival were classified into the high-GC-SVM group. The SVM data processing methods were conducted as previously described (22, 31, 32). The programs were coded using R software; scripts are available on request.
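To make the idea of SVM-recursive feature elimination concrete, the following is a minimal sketch in plain Python rather than the R programs used in the study. It assumes a linear SVM fitted by full-batch subgradient descent on the regularized hinge loss and ranks features by squared weight; the toy data, feature names, and hyperparameters are all hypothetical and are not the study's data or settings.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit (w, b) by subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample lies inside the margin or is misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, names, n_keep=1):
    """Repeatedly refit and drop the feature with the smallest squared weight."""
    active = list(range(len(names)))
    eliminated = []
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        eliminated.append(names[active.pop(worst)])
    return [names[j] for j in active], eliminated


# Toy data (hypothetical): y = +1 could stand for "alive at 5 years".
names = ["informative", "weak", "noise"]
X = [[2.0, 0.5, 1.0], [2.0, 0.5, -1.0], [2.0, 0.5, 1.0], [2.0, 0.5, -1.0],
     [-2.0, -0.5, 1.0], [-2.0, -0.5, -1.0], [-2.0, -0.5, 1.0], [-2.0, -0.5, -1.0]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

if __name__ == "__main__":
    kept, dropped = svm_rfe(X, y, names)
    print("kept:", kept, "eliminated in order:", dropped)
```

In practice, the number of features to retain would be chosen by comparing model accuracy across subset sizes, as the study reports doing when it arrived at 11 features.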
Statistical analysis
We compared two groups using the t test for continuous variables and the χ2 test for categorical variables. Survival curves were generated according to the Kaplan–Meier method and compared using the log-rank test; in univariable analysis, this was done for the different values of each variable. Variables that reached significance with P < 0.05 were entered into the multivariable analyses using the Cox regression model. Interactions between the classifier and treatment were also assessed by means of the Cox model. All statistical analyses were performed using R software (version 3.0.1) and SPSS software (version 19.0). Statistical significance was set at 2-sided P < 0.05.
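For readers unfamiliar with the two survival methods named above, the sketch below shows a Kaplan–Meier estimator and a two-group log-rank chi-square statistic in plain Python. It is an illustration under stated assumptions, not the R/SPSS routines used in the study: function names are hypothetical, the follow-up times are made up, and events are coded 1 for death (or recurrence) and 0 for censoring.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time."""
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for tt in times if tt >= t)
        deaths = sum(1 for tt, e in zip(times, events) if tt == t and e == 1)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_chi2(times_a, events_a, times_b, events_b):
    """Log-rank chi-square statistic (1 degree of freedom) for two groups."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for tt, _, g in pooled if tt >= t and g == 0)
        n_b = sum(1 for tt, _, g in pooled if tt >= t and g == 1)
        d_a = sum(1 for tt, e, g in pooled if tt == t and e == 1 and g == 0)
        d = sum(1 for tt, e, _ in pooled if tt == t and e == 1)
        n = n_a + n_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected deaths in group A under the null
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return (observed_a - expected_a) ** 2 / variance


if __name__ == "__main__":
    # Illustrative follow-up in months; 1 = event, 0 = censored.
    print(kaplan_meier([6, 12, 12, 20, 30, 60], [1, 1, 0, 1, 0, 0]))
    print(logrank_chi2([1, 2], [1, 1], [3, 4], [1, 1]))
```

The chi-square value would be referred to a chi-square distribution with one degree of freedom to obtain the P value.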
Results
Patient characteristics and components of the GC-SVM classifier
The clinicopathologic characteristics for the training cohort (n = 251), internal validation cohort (n = 248), and external validation cohort (n = 287) are listed in Table 1. Of the 786 patients included in the study, 515 (65.5%) were men, and the median [interquartile range (IQR)] age of all patients was 57 (49–65) years. In the training cohort, the median (IQR) survival times for DFS and OS were 26 (7–73) and 36 (15–76) months, respectively. The corresponding median (IQR) survival times for DFS and OS were 32 (9–75) and 51 (15–75) months in the internal validation cohort and 33 (11–80) and 40 (15–82) months in the external validation cohort.
We used X-tile plots to generate the optimum cutoff score for all 15 IHC features in the training cohort, and Supplementary Table S3 showed the results of the univariate analysis between each of the 15 features and the survival in the training cohort. On the basis of the SVM analysis of the training data, the GC-SVM classifier integrated sex, CEA, lymph node metastasis, and expression of eight features, including CD3IM, CD3CT, CD8IM, CD45ROCT, CD57IM, CD68CT, CD66bIM, and CD34, as critical factors. Univariate associations of the GC-SVM classifier, clinicopathologic parameters, and the expression of each of the evaluated features with OS and DFS in the training and validation cohorts were shown in Supplementary Tables S3 to S5. Representative IHC staining of all the features in tumor tissues was shown in Supplementary Fig. S1.
The ROC curves for traditional clinicopathologic prognostic factors (age, sex, tumor size, differentiation, Lauren type, CEA, CA19-9, and TNM stage), for each immune feature, and for the overall GC-SVM classifier illustrated the point with the maximum area under the curve (AUC) for each factor. In the three cohorts, the AUCs of the GC-SVM classifier for 5-year OS and DFS (training cohort: 0.796, 0.805; internal validation cohort: 0.809, 0.813; external validation cohort: 0.834, 0.828; respectively; Supplementary Fig. S2; Supplementary Tables S6 and S7) were significantly greater than the AUCs for all other prognostic factors considered (next largest AUC for TNM stage, training cohort: 0.649, 0.659; internal validation cohort: 0.746, 0.678; external validation cohort: 0.745, 0.737, respectively). Supplementary Table S8 lists the relationships between the GC-SVM classifier and clinicopathologic characteristics in the training, internal, and external validation cohorts. The GC-SVM classifier was significantly associated with TNM stage, tumor size, CA19-9, and lymph node metastasis of GC (Supplementary Table S8).
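As a reminder of what the AUC values above measure, the following minimal sketch computes the AUC for a binary outcome as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The function name, scores, and labels are hypothetical; whether the study used this plain binary form or a time-dependent ROC method is not stated, so this is only an assumption for illustration.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative cases."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative risk scores; label 1 = event within 5 years.
    print(roc_auc([0.9, 0.8, 0.4, 0.35, 0.1], [1, 1, 0, 1, 0]))
```

An AUC of 0.5 corresponds to a predictor no better than chance, and 1.0 to perfect separation of the two outcome groups.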
GC-SVM classifier and GC survival
In the training cohort, we defined 172 patients as low GC-SVM and 79 patients as high GC-SVM. The 5-year OS and DFS were 15.7% and 10.5%, respectively, for the low-GC-SVM patients, and 78.5% and 68.4%, respectively, for the high-GC-SVM patients [HR 0.113 (0.065–0.197) and 0.138 (0.084–0.224), respectively; all P < 0.0001; Fig. 1A]. We performed the same analyses in the internal validation cohort. The 5-year OS and DFS were 19.0% and 15.3%, respectively, for the low-GC-SVM patients, and 82.0% and 77.5%, respectively, for the high-GC-SVM patients [HR 0.166 (0.110–0.250) and 0.186 (0.126–0.274), respectively; all P < 0.0001; Fig. 1B]. To confirm that the GC-SVM classifier had an excellent prognostic value in different populations, we further applied it to the external validation cohorts and found similar results (Fig. 1C). The GC-SVM classifier also remained a clinically and statistically significant predictor of prognosis after stratification by clinicopathologic factors (Supplementary Figs. S3–S6).
In univariable analysis, low-GC-SVM patients were associated with significantly poorer OS and DFS (Supplementary Tables S3–S5). Variables demonstrating a significant effect on OS and DFS were included in the multivariable analysis. Multivariate Cox regression analysis after adjustment for clinicopathologic variables and TNM stage revealed that the GC-SVM classifier remained a powerful and independent prognostic factor for OS and DFS in the training, internal, and external validation cohorts (Table 2).
We performed stratified analyses of GC patients with stage I, II, III, and IV disease in the internal and external validation cohorts. In both cohorts, high-GC-SVM patients with stage I, II, III, or IV disease had longer OS and DFS than low-GC-SVM patients (Fig. 2). Furthermore, the GC-SVM classifier exhibited a higher prognostic accuracy than TNM stage, any clinicopathologic risk factor, or any single IHC feature alone (Supplementary Fig. S2; Supplementary Tables S6 and S7).
GC-SVM classifier and adjuvant chemotherapy benefit
Furthermore, we investigated whether low- or high-GC-SVM patients with stage II or III GC could benefit from postoperative adjuvant chemotherapy. A test for an interaction between GC-SVM and adjuvant chemotherapy indicated that, in both stage II and stage III disease, the benefit from adjuvant chemotherapy was greater among patients with high GC-SVM [stage II: OS, HR 0.156 (0.044–0.554), P = 0.004; DFS, HR 0.280 (0.106–0.741), P = 0.010; stage III: OS, HR 0.472 (0.242–0.919), P = 0.027; DFS, HR 0.448 (0.240–0.836), P = 0.012; all P < 0.0001 for interaction; Table 3] than among those with low GC-SVM. The corresponding Kaplan–Meier survival curves for patients with stage II or stage III disease, which compare low with high GC-SVM by treatment, are shown in Fig. 3. The subset analysis using the GC-SVM classifier revealed that adjuvant chemotherapy significantly increased OS and DFS in the high-GC-SVM group (stage II: P = 0.001 and P = 0.006; stage III: P = 0.023 and P = 0.009, respectively), but had no significant effect in the low-GC-SVM group (stage II: P = 0.139 and P = 0.395; stage III: P = 0.347 and P = 0.394, respectively; Fig. 3). Consequently, these results suggest that stage II and III patients with high GC-SVM could benefit from adjuvant chemotherapy.
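One common way to write the interaction test described above is a Cox model with a product term. The specification below is an assumption for illustration, since the exact model is not given in the text; the symbols G (1 for high GC-SVM, 0 for low) and C (1 for adjuvant chemotherapy, 0 for none) are introduced here and do not appear in the original.

```latex
h(t \mid G, C) = h_0(t)\,\exp\!\left(\beta_G G + \beta_C C + \beta_{GC}\, G C\right)
\\[4pt]
\mathrm{HR}_{\text{chemo} \mid \text{low GC-SVM}} = e^{\beta_C},
\qquad
\mathrm{HR}_{\text{chemo} \mid \text{high GC-SVM}} = e^{\beta_C + \beta_{GC}}
```

Under this form, the test for interaction is a test of the null hypothesis that the coefficient of the product term is zero; a negative value would mean that chemotherapy lowers the hazard more in the high-GC-SVM group than in the low-GC-SVM group.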
Discussion
Prognostic assessment is crucial for making appropriate treatment choices. Because GC is a clinically heterogeneous disease, with large variations in clinical outcomes even among patients with the same stage (33, 34), we sought to improve the prediction of GC prognosis by developing a novel 11-feature GC-SVM classifier that categorizes patients into low- and high-GC-SVM groups with large differences in 5-year OS and DFS. Cox regression analysis showed that the GC-SVM classifier was an independent prognostic factor for OS and DFS, even after adjustment for TNM stage and clinicopathologic characteristics. In addition, ROC analysis suggested that the survival predictive ability of the GC-SVM classifier was better than that of TNM stage and clinicopathologic characteristics. Moreover, in analyses stratified by TNM stage, the GC-SVM classifier divided patients within each stage into low- and high-risk groups with significant differences in OS and DFS in the internal and external cohorts, supporting the prognostic value of the classifier and potentially allowing clinicians to identify candidates for more effective systemic approaches. Thus, the GC-SVM classifier can add prognostic value to the TNM staging system and provides clinicians with a valid and reliable tool for better prediction of GC prognosis. Ultimately, patients classified with the same TNM stage might be stratified into different risk groups on the basis of the GC-SVM classifier, and thus treated with systemic approaches of different intensities to improve outcomes.
Adjuvant chemotherapy has been recommended as a standard component of therapy for patients with stage II and III GC and improves their outcomes (2, 3). However, not every patient benefits from adjuvant chemotherapy, and the criteria for selecting candidates remain controversial (2, 3, 20, 33). It is important to identify patients whose tumors are sensitive to chemotherapy and who will have better overall outcomes, so that excessive toxicities in other patients can be avoided. Assignment of treatment based in part on tumor molecular characteristics is an increasingly promising approach (17, 23, 35). Previous studies have shown that tumor-infiltrating immune cells were associated with chemotherapeutic response in other types of cancer (6, 36–39). In this study, we assessed the association between the GC-SVM classifier and clinical outcomes in stage II and III patients receiving adjuvant chemotherapy. The results suggested that patients with high GC-SVM were more likely to obtain a survival benefit from adjuvant chemotherapy than those with low GC-SVM, indicating that the GC-SVM classifier could be an important factor for predicting the efficacy of chemotherapy. This will be useful for better selection and management of patients who would receive adjuvant chemotherapy.
Substantial efforts have been made toward identification of molecular signatures to predict survival in GC patients, including gene signatures, miRNAs, and epigenetic biomarkers (18, 40–42). However, these gene-based signatures have not been introduced into clinical practice as widely as initially expected because of the variability of measurements in microarray assays, inconsistencies in assay platforms, and the requirement for analytical expertise (41, 43, 44). Cheong and colleagues described a predictive test for prognosis and response to adjuvant chemotherapy in patients with localized, resectable GC (40). The test is based on a four-gene real-time RT-PCR assay, which measures gene-expression levels in FFPE tumor tissues, but the use of FFPE tumor tissues may decrease the reliability of RNA quantitation. IHC not only provides a semiquantitative assessment of protein abundance but also defines the cellular localization of protein expression (40). Identification of immunobiomarkers with IHC, which has been widely applied in clinical diagnosis, is therefore a promising alternative strategy for the molecular profiling of tumors (31, 45).
In recent years, immune profiling studies have moved to the forefront of research on solid tumors, including GC. Several studies showed that tumor-infiltrating CD3+, CD4+, CD8+, CD57+, CD45RO+, and CD45RA+ cells were associated with better survival in patients with GC, whereas CD66b+ and CD68+ cells were associated with a significantly worse outcome (8, 10, 17, 19, 46). A recent meta-analysis summarized the impact of immune cells, including B cells, natural killer (NK) cells, myeloid-derived suppressor cells, macrophages, and all subsets of T cells, on clinical outcome from more than 120 published articles (9). Importantly, the beneficial impact of immune infiltration with cytotoxic and memory T-cell phenotypes has been demonstrated in cancers of diverse anatomical sites, including not only GC but also malignant melanoma and lung, colorectal, esophageal, breast, and bladder cancers (9). Our previous studies also showed that tumor-infiltrating NK cells predicted a good prognosis and that infiltrating neutrophils were inversely correlated with survival in patients with GC (10, 17, 19), which was consistent with the results of other studies (8, 18). In this study, the GC-SVM classifier, including eight features (CD3 IM, CD3 CT, CD8 IM, CD45RO CT, CD57 IM, CD68 CT, CD66b IM, and CD34), could effectively predict survival and complemented the prognostic value of the TNM staging system.
Tumor progression is the product of evolving crosstalk between malignant cells and various stromal and immune cell subsets of the surrounding microenvironment (47, 48). During this process, the tumor cells interact with their microenvironment, which is complex and composed of stromal and immune cells that penetrate the tumor site via blood vessels and lymphoid capillaries (49). All subsets of immune cells can be found in tumors, but their respective density, functionality, and organization may vary in different spatial locations of tumor (10, 28, 49–51). It is known that immune cells are scattered in the CT within the tumor stroma and the tumor glands, in the IM, and in organized lymphoid follicles distant from the tumor. A statistically significant correlation between immune cell density in either tumor region (CT or IM) and patient outcome has been shown in gastric and colorectal cancers (10–12, 17). Given the major clinical importance of distinct tumor regions, it is appropriate to conduct immune cell infiltration evaluation systematically in the two separate areas, the CT and the IM (11, 12, 17). After ROC curve–based optimization, although the AUC values of some markers in the IM and CT were similar in this study, most markers in IM and CT still had relatively large differences of predictive abilities, such as CD66b (AUC values for 5-year death: IM, 0.657; CT, 0.582; in the training cohort) and CD45RO (AUC values for 5-year death: IM, 0.620; CT, 0.654; in the training cohort). Besides, Tumeh and colleagues showed that preexisting CD8+ T cells distinctly located at the invasive tumor margin were associated with the expression of the PD-1/PD-L1 immune inhibitory axis and might predict response to therapy (52). Tumor-infiltrating lymphocyte densities at the IM of liver metastases could predict response to chemotherapy in metastatic colorectal cancer (36). 
Our previous studies also found that neutrophils infiltrating the IM of the tumor are far more numerous than those in the CT and have a higher prognostic value (10). Thus, the spatial information appears to carry biological significance, and further research is needed.
Compared with other machine learning algorithms, SVM is better suited to classification based on high-dimensional data with a limited number of training samples and to selecting the most informative of all available features (29, 30). Previous studies have shown that a single biomarker has limited prognostic value for GC (8, 17, 23). To improve on the prognostic value of individual markers, SVM can combine clinicopathologic features with independently informative markers to predict disease outcome. Our GC-SVM classifier integrates patient sex, CEA, and lymph node metastasis along with eight features, including CD3IM, CD3CT, CD8IM, CD45ROCT, CD57IM, CD68CT, CD66bIM, and CD34. Furthermore, the GC-SVM classifier was substantially more strongly associated with OS and DFS than any individual component. We adopted the SVM-recursive feature elimination algorithm to select the features and developed the GC-SVM classifier with 11 features; with these 11 features integrated, the accuracy of the model was the highest in the training cohort. The SVM model also accounted for the interactions of all features (29, 30, 53). Although not every one of these features had the highest individual prognostic value, their combination gave the best model accuracy. However, the underlying biological reasons are not yet clear and should be explored in further research. Compared with the immunoscores and other prognostic models of GC reported by previous studies (11, 12, 17, 54, 55), our GC-SVM classifier comprehensively integrated information on tumor-infiltrating lymphocyte features, myeloid cell features, and clinicopathologic features using SVM algorithms, which could significantly improve its predictive accuracy. Thus, our results indicated that the GC-SVM classifier was able to select the most informative factors that contributed independently and collectively to the prediction of prognosis.
In The Cancer Genome Atlas project, GC was divided into four subtypes based on molecular classification: Epstein–Barr virus (EBV)-positive, MSI, genomically stable, and chromosomal instability (56). Subsequently, various hypotheses were proposed to describe the impact of molecular processes on the intensity and nature of the host immune response (57, 58). Immune cells of many different types are frequently observed in primary GC, and certain molecular subtypes of GC, such as the EBV-positive and MSI subtypes, are well known to be associated with a high lymphocytic infiltrate (57–59). About 10% of the tumors are EBV-positive and display recurrent PIK3CA mutations, extreme DNA hypermethylation, and amplification of JAK2, PD-L1, and PD-L2 (56). It is noteworthy that aberrant methylation can be induced by infectious agents such as Helicobacter pylori or EBV (42, 60, 61). According to a phase II study, mismatch repair deficiency renders different solid tumors highly sensitive to immune checkpoint blockade with the PD-1 inhibitor pembrolizumab, and these tumors contain prominent immune infiltrates (62, 63). Therefore, future studies should investigate the association between the immunobiomarkers included in the classifier and the molecular classification based on GC causes, and explore whether the classifier can predict the responses of patients with GC to immunotherapy.
Our study has several limitations. First, it was retrospective in nature, and all specimens were obtained from patients in southern China. The full chemotherapy details might not be available for all patients in the cohorts. Therefore, our results need to be validated in a larger, prospective, multicenter randomized trial. Furthermore, the mechanism underlying the predictive value of the multifeature classifier is not yet clear, and further investigation may provide more information for better understanding of the roles of these features in the development and progression of GC and provide additional information and strategies for treatment (38, 39, 64, 65). Another limitation is the constrained number of biomarkers screened in the training cohort, which in turn resulted in a smaller panel of features integrated into the GC-SVM classifier than in some gene expression profiling studies by cDNA array (40–42). Although the GC-SVM classifier was a highly accurate predictor of OS and DFS, we are aware that other biomarkers may extend the precision and predictive value of the classifier, and new markers are being found and new techniques developed every year. Thus, the GC-SVM classifier may be further improved by including additional markers.
In conclusion, the study demonstrates that the GC-SVM classifier can accurately distinguish GC patients with substantially different OS and DFS. Furthermore, the GC-SVM classifier could identify a subgroup of patients with stage II and III disease who could benefit from adjuvant chemotherapy. Thus, the GC-SVM classifier might facilitate patient counseling, decision-making regarding individualized therapy, and follow-up scheduling.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: L. Huang, Y. Hu, Q. Zhang, T. Li, S. Cai, G. Li
Development of methodology: J. Xie, W. Liu, L. Huang, T. Li, G. Li
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Y. Jiang, J. Xie, Z. Han, W. Liu, S. Xi, L. Huang, W. Huang, T. Lin, L. Zhao, Y. Hu, J. Yu, T. Li, S. Cai, G. Li
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y. Jiang, J. Xie, Z. Han, W. Liu, S. Xi, L. Huang, W. Huang, L. Zhao, Y. Hu, J. Yu, Q. Zhang, S. Cai, G. Li
Writing, review, and/or revision of the manuscript: Y. Jiang, J. Xie, Z. Han, S. Xi, L. Huang, W. Huang, L. Zhao, Y. Hu, J. Yu, Q. Zhang, T. Li, S. Cai, G. Li
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Y. Jiang, J. Xie, Z. Han, W. Liu, S. Xi, L. Huang, W. Huang, T. Lin, L. Zhao, Y. Hu, J. Yu, Q. Zhang, T. Li, S. Cai, G. Li
Study supervision: Q. Zhang, T. Li, S. Cai, G. Li
Acknowledgments
This work was supported by grants from the National Natural Science Foundation of China (81672446, 81600510, 81370575, and 81570593), Key Clinical Specialty Discipline Construction Program (2017YFC0108300), the National Key Research and Development Program of China (2017YFC0108300), Natural Science Foundation of Guangdong Province (2014A030313131), Science and Technology Planning Project of Guangzhou (2014B020228003, 2014B030301041, and 2015A030312013), and Director's Foundation of Nanfang Hospital (2016B010).