Purpose: The dismal outcome of hepatocellular carcinoma (HCC) is largely attributed to its early recurrence and venous metastases. We aimed to develop a metastasis-related model to predict hepatocellular carcinoma prognosis.

Experimental Design: Using microarrays, sequencing, and RT-PCR, we measured the expression of mRNAs and lncRNAs in a training set of 94 well-defined low-risk (LRM) and high-risk metastatic (HRM) HCC patients from a Shanghai cohort. We refined a metastasis signature and established a corresponding model using logistic regression analysis. The validation set consisted of 567 HCC patients from four-center cohorts. Survival analysis was performed according to the metastasis model.

Results: Using relative expression of tumor to para-tumor tissues, we refined the metastasis signature of five mRNAs and one lncRNA. A generalized linear model was further established to predict the probability of metastasis (MP). Using an MP cutoff of 0.7 to separate LRM and HRM in the Shanghai cohort, the specificity and sensitivity of the model were 96% [95% confidence interval (CI), 85%–99%] and 74% (95% CI, 58%–86%), respectively. Furthermore, HRM patients showed significantly shorter overall and recurrence-free survival in the validation cohorts (P < 0.05 for each cohort). Among patients with early-stage HCC, those classified as HRM also had poorer outcomes across the multicenter cohorts. Finally, Cox regression analysis indicated that continuous MP was an independent risk factor associated with the recurrence and survival of HCC patients after resection (HR 2.98–16.6, P < 0.05).

Conclusions: We developed an applicable six-gene metastasis signature, which is robust and reproducible in multicenter cohorts for HCC prognosis. Clin Cancer Res; 23(1); 289–97. ©2016 AACR.

Translational Relevance

Postsurgical recurrence or intrahepatic metastasis is responsible for most cancer-related deaths in hepatocellular carcinoma (HCC). To evaluate HCC metastasis and prognosis, we identified a novel signature of five mRNAs and one lncRNA in a cross-platform and unbiased manner. A generalized linear model was further developed to predict the probability of metastasis, which showed significant prognostic value. The predicted probability of metastasis was highly associated with overall and recurrence-free survival in multicenter cohorts. The metastasis model could also be used for the prognosis of HCC patients with early recurrence or those at an early Barcelona-Clinic Liver Cancer (BCLC) stage. Moreover, the metastasis signature was identified as a survival-related factor independent of current clinical systems, for example, the BCLC staging system. Our six-gene metastasis signature is robust and applicable for stratifying HCC patients who may benefit from personalized adjuvant therapy.

Liver cancer, predominantly hepatocellular carcinoma (HCC), is the seventh most common cancer worldwide (1). As the third most frequent cause of cancer-related death, HCC has a recurrence rate of more than 70% and around 80% cancer-related mortality within 5 years after surgery (2, 3). The poor outcome of HCC is in large part due to the lack of timely diagnosis and frequent intrahepatic metastasis. Indeed, vascular invasion as a representative of intrahepatic metastasis is a major cause of tumor recurrence within 2 years after resection (4). Currently, it is urgent to develop an efficient and reliable assessment of metastatic risk for predicting clinical outcome of HCC patients who may benefit from personalized adjuvant treatment.

With the development of high-throughput microarray and RNA sequencing, an increasing number of publications have addressed HCC prognosis. A 164-gene signature has been reported to predict the clinical behavior of metastatic HCC patients (5). Another study established a five-gene score to predict HCC survival after liver resection (6). Nevertheless, prediction could still be improved by including new types of genes, for example, long noncoding RNAs (lncRNAs). Recently, lncRNAs have been shown to be associated with metastasis and the prognosis of HCC patients (7, 8). Therefore, it is necessary to explore new molecular signatures through cross-platform profiling and systematic analysis of lncRNAs and mRNAs.

Here, we carried out a retrospective study to stratify multicenter HCC patients by risk of metastasis in order to predict the clinical outcomes of metastasis, recurrence, and survival. We globally profiled the relative expression of mRNAs and lncRNAs in the Shanghai cohort to refine the potential gene signature. Using RT-PCR, we confirmed its expression pattern and then developed and validated a metastasis model. The model could separate HRM from LRM and successfully predict HCC prognosis in the training and validation sets. According to the metastasis model, HRM patients showed markedly poorer overall survival (OS) and recurrence-free survival (RFS) in the Fujian, Guangxi, LCI-Sh, and Korean cohorts. Survival analysis of a subgroup of HCC patients with a single nodule or at stage A of Barcelona-Clinic Liver Cancer (BCLC) also indicated that HRM patients had significantly shorter survival. Moreover, the predicted probability of metastasis (MP) was a survival-related risk factor independent of well-known clinical indicators or staging systems.

Patients and samples

The study design (Fig. 1) and data analysis followed the reporting recommendations for tumor marker prognostic studies (REMARK) and the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD; refs. 9, 10). HCC patients were pathologically diagnosed by two experienced pathologists who did not participate in data analysis. The inclusion criteria of primary HCC patients were as follows: (i) patients did not receive any anticancer treatment before surgery; and (ii) the complete resection of all tumor nodules was pathologically verified by the cutting surface. However, HCC patients were excluded under the following criteria: (i) tumors had more than 80% necrosis or insufficient amount of RNA; (ii) patients who died within the first month following surgery because of surgical complications and/or decompensated cirrhosis; and (iii) patients who died of nonliver diseases or accidents.

Figure 1.

Flow chart of the study. The study was performed in multicenter cohorts including EHBH-Sh, Fujian (Fj), Guangxi (Gx), LCI-Sh, and Korea (Ko). The samples from EHBH-Sh were profiled through microarray and RNA sequencing (RNA-seq), and used to identify potential metastasis signature. Another batch of EHBH-Sh samples were measured by RT-PCR to confirm the expression of potential metastasis signature and establish a metastasis model. The prognosis analysis was performed to validate the metastasis model using independent samples from Fj, Gx, LCI-Sh, and Ko cohorts. The number of selected patients is shown in each box.

According to the criteria and considering the statistical power to identify a potential gene signature for prognosis, we collected fresh/frozen tumors and paired adjacent liver tissues from 100 consecutive HCC patients who underwent surgical resection at the Eastern Hepatobiliary Surgery Hospital in Shanghai (EHBH-Sh) between March 2009 and May 2010. Following the same criteria, the validation set contained 78 HCC patients from Mengchao Hepatobiliary Hospital of Fujian Medical University and 80 patients from the Affiliated Tumor Hospital of Guangxi Medical University who were treated with surgery between March 2007 and July 2012. The noncancerous liver tissues were isolated at least 2 cm away from the tumor border and confirmed to be free of tumor cells using microscopy. All samples were frozen immediately after surgery and preserved at −80°C. The local institutional review boards approved the inclusion of all tissues, and all patients signed informed consent and were anonymized for research. Two other independent cohorts used for validation were from the Liver Cancer Institute of Shanghai (LCI-Sh, Zhongshan hospital) and South Korea, comprising 247 and 286 HCC patients, respectively (11, 12).

Gene expression measured by microarray, sequencing, and RT-PCR

All 100 EHBH-Sh patient tissues were separated into three batches. A total of 5 μg of RNA, including mRNAs and lncRNAs, was extracted from each sample using TRIzol (Invitrogen) for microarray profiling (26 of 30 HCC patients in the first batch, designated Sh-mic; Arraystar) and RNA sequencing (20 patients in the second batch, designated Sh-seq; Illumina). Raw microarray and sequencing data were normalized using the RMA method of NimbleScan v2.5 and Cufflinks software, respectively (13). The normalized expression values were log2-transformed after values below 10^−5 were set to 10^−5 to avoid zero values.

For the third batch of 60 HCC patients, including 10 replicates measured by microarrays, we measured relative expression of the potential biomarkers using RT-PCR (designated Sh-PCR). We also measured gene expression in HCC patients from the Fujian and Guangxi cohorts using RT-PCR. TaqMan probes for RT-PCR (Supplementary Table S1) were designed and synthesized by Applied Biosystems. Gene expression was quantified by –ΔΔCt values using 18S rRNA expression as an internal control. The data are available at GEO (http://www.ncbi.nlm.nih.gov/geo) with the accession number GSE54238. Moreover, expression profiles of tumors and paired nontumor tissues from 228 LCI-Sh patients and 181 Korean patients had been measured using microarrays (GEO accession: GSE14520 and GSE36376; refs. 11, 12).

Selection, modeling, and validation of metastasis signature

Each batch of EHBH-Sh samples was further divided into two subgroups, typical low-risk (LRM) and high-risk metastatic (HRM) HCC, as defined in previous reports (14, 15). Patients in the LRM group were defined as having: (i) primary HCC without venous metastases, even in endoscopic vision; and (ii) no recurrence within 2 years after liver resection. HRM patients were those with primary HCC and venous metastases, defined by the presence of macroscopic tumor thrombus in a major branch of the portal vein or the inferior vena cava at diagnosis. We included 94 well-defined HCC patients with a solitary nodule after excluding six patients who did not belong to LRM or HRM. Among the PCR-measured samples in Sh-PCR, Fujian, and Guangxi, we also excluded one sample with abnormal expression from each cohort. Abnormal expression was defined as expression exceeding three times the SEM in more than one gene of the metastasis signature.

We used the “limma” package to identify differentially expressed genes between the LRM and HRM groups (16). To compare expression profiles from the microarrays and sequencing, we centered and scaled gene expression by subtracting the average expression and dividing by the standard error for each batch of samples. We performed leave-one-out cross-validation to assess the potential of differentially expressed genes to predict the risk of metastasis using six prediction algorithms in BRB-ArrayTools (17). Meanwhile, we used hierarchical clustering to show the potential of the gene set to separate LRM from HRM.
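The per-batch centering and scaling step can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `standardize_batch` and the toy expression values are hypothetical, and the sketch scales by the sample standard deviation, which is an assumption (the text refers to the standard error).

```python
from statistics import mean, stdev

def standardize_batch(values):
    """Center one gene's expression within one batch and scale it.

    Assumption: the scale factor is the sample standard deviation.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("constant expression cannot be scaled")
    return [(v - mu) / sd for v in values]

# Hypothetical log2 relative expression of one gene on two platforms.
microarray_batch = [-1.2, -0.4, 0.3, 1.1]
sequencing_batch = [-6.0, -2.0, 1.5, 5.5]

# After standardization, both batches are on a comparable scale
# (mean 0, unit spread), regardless of the platform's dynamic range.
z_mic = standardize_batch(microarray_batch)
z_seq = standardize_batch(sequencing_batch)
```

Standardizing within each batch, rather than across the pooled samples, keeps platform-specific offsets from being mistaken for biological differences.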

We carried out binary logistic regression with metastasis status as the dependent variable and the relative expression of the metastasis signature genes as covariates. Logistic regression with all or a subset of the candidate genes was performed using the “glm” function to identify the optimal formula for separating LRM from HRM (18). The model for predicting the risk of metastasis (named the metastasis model herein) was a linear combination of the biomarkers, weighted by their regression coefficients.

An ROC curve was used to assess the sensitivity and specificity of the metastasis model to separate HRM from LRM. The optimal cutoff for dividing LRM and HRM was set to the point with the maximal sum of sensitivity and specificity according to the ROC curve of Sh-PCR samples. Area under ROC curve (AUC) and 95% confidence interval (CI) of AUC were calculated using R package “pROC” (19).
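The cutoff rule described above, choosing the point with the maximal sum of sensitivity and specificity, can be illustrated with a short sketch. It is not the pROC implementation used in the study; the function name `best_cutoff` and the scores and labels are hypothetical, and a sample is assumed to be called HRM when its score is strictly above the cutoff.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing sens + spec.

    scores: predicted probabilities; labels: 1 = HRM, 0 = LRM.
    A sample is called HRM when its score is strictly above the cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        # Keep the first cutoff that strictly improves the sum.
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best

# Hypothetical predicted probabilities and true groups (not study data).
scores = [0.10, 0.35, 0.40, 0.65, 0.72, 0.80, 0.91, 0.95]
labels = [0, 0, 1, 0, 1, 1, 1, 1]

cutoff, sens, spec = best_cutoff(scores, labels)
print(cutoff, sens, spec)  # 0.65 0.8 1.0
```

In this toy example the chosen cutoff trades one missed HRM case for perfect specificity, the same kind of trade-off that a high-specificity operating point implies.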

Statistical analysis

All patients were monitored by abdomen ultrasonography, chest X-ray, and a test of serum α-fetoprotein (AFP), albumin, and total bilirubin every month during the first year after surgery and every 3 to 6 months thereafter until August 30, 2013. Two physicians unaware of the study performed the follow-ups through a CT scan or MRI of the abdomen every 6 months or immediately if recurrence was suspected. Once recurrent tumors were confirmed, further treatment was implemented on the basis of the tumor's diameter, number, location, vessel invasion, as well as liver function. Early recurrence was defined as tumor recurrence within 2 years after surgery. The RFS was calculated from the date of tumor resection until the detection of tumor recurrence, death, or last observation. The OS was defined as the length of time between the surgery and death or the last follow-up.

A multiple imputation approach was performed to handle missing data using R package “mi” (20). We compared survival difference between LRM and HRM using the Kaplan–Meier method and the log-rank test (21). Univariate and multivariate Cox proportional hazards regression analyses were performed to identify survival-related independent factors. We tested proportional hazards assumption for the Cox regression model based on weighted residuals (22). The covariates with P < 0.3 in univariate analyses and the known prognostic factors (e.g., BCLC) were included in multivariate analysis. All analyses were performed using the R statistical package (http://www.r-project.org/).
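For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines. The study itself used R; this plain-Python version is only an illustration, the function name `kaplan_meier` and the follow-up data are hypothetical, and the log-rank test is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times: follow-up in months; events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Censored patients leave the risk set without changing survival.
        at_risk -= removed
    return curve

# Hypothetical follow-up data (months), not taken from the study.
times = [6, 12, 12, 20, 30, 36]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Computing one such curve per predicted risk group (LRM and HRM) gives the stratified survival curves that the log-rank test then compares.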

Patients

Among 100 HCC patients from EHBH-Sh, 94 were included in the training set according to the predefined criteria (Fig. 1). Initially, we measured expression of lncRNAs and mRNAs in tumor and adjacent liver tissues in 26 HCC patients using microarray and in 20 patients through RNA sequencing. In the other 48 HCC patients and in 10 replicates from the microarray measurements, relative expression of the refined gene signature was measured by RT-PCR. The three batches of clinically well-defined HCC samples (named Sh-mic, Sh-seq, and Sh-PCR, respectively) contained a total of 52 LRM and 42 HRM HCC patients (Table 1 and Supplementary Table S2).

Table 1.

Clinical characteristics of HCC patients from three cohorts

Variable | Shanghai (EHBH-Sh) | Fujian | Guangxi | P^a
Patients (male) | 90 (96%) | 61 (88%) | 62 (87%) | 0.10
Age (year) | 50.5 (44–56) | 53 (44–61) | 45 (38.5–55.5) | 0.0077
Tumor thrombus (positive)^b | 47 (50%) | 31 (44.9%) | 13 (18.3%) | 5.4E−5
Tumor envelope (positive) | 41 (43.6%) | 43 (62.3%) | 41 (57.7%) | 0.059
Tumor size (cm) | 7.0 ± 4.3 | 6.2 ± 4.1 | 4.6 ± 1.7 | 1.9E−4
Cirrhosis (positive) | 70 (74.5%) | 55 (79.7%) | 59 (83.1%) | 0.0014
Recurrence | 41 (43.6%) | 46 (66.7%) | 36 (50.7%) | 0.038
AFP, α-fetoprotein (μg/L) | 560 ± 574 | 823 ± 2768 | 692 ± 1257 | 0.63
ALB, albumin (g/L) | 42.5 ± 3.8 | 39.5 ± 4.7 | 41 ± 4.1 | 0.0003
ALT, alanine transaminase (U/L) | 52.4 ± 48.5 | 45.7 ± 30.2 | 48.9 ± 52.5 | 0.25
BCLC (0:A:B:C) | 2:18:65:7 | 0:52:10:7 | 0:41:21:8 | 9.0E−14

^a Fisher exact test and ANOVA were used for nominal and continuous variables, respectively.

^b Tumor thrombus includes both macroscopic and microscopic tumor thrombus.

There was no significant difference between LRM and HRM in age, status of complicated cirrhosis, and grade in any batch of samples (Supplementary Table S2). However, the two groups differed significantly in metastasis-related characteristics, including tumor thrombus, tumor envelope, and satellite nodules, in all three batches of samples.

Meanwhile, we randomly chose 78 and 80 HCC patients from the Fujian and Guangxi cohorts, respectively. In addition, another 228 and 181 HCC patients from LCI-Sh and Korea, respectively, were also used to validate the metastasis signature and model. Among the Shanghai, Fujian, and Guangxi cohorts, some heterogeneity was observed, especially in the status of tumor thrombus, tumor size, and BCLC stage (Table 1). In the LCI-Sh cohort, 91.2% of HCC patients were HBV-positive, whereas 75.9% of HCC patients were HBV-positive in the Korean cohort. This clinical heterogeneity indicates that our study covered a broad range of HCC patients.

Metastasis-related gene signature

First, we identified differentially expressed genes by comparison of HRM to LRM HCC patients and then assessed their potential to divide HRM and LRM. Using microarray, we profiled 13 LRM and 13 HRM patients and calculated relative expression of mRNAs and lncRNAs in tumor compared with adjacent liver tissue for each patient. For HRM relative to LRM, we identified 168 lncRNAs and 155 mRNAs that were differentially expressed (fold change ≥2 and P < 0.001). We further performed RNA sequencing in another 10 LRM and 10 HRM patients to confirm that 8 lncRNAs and 43 mRNAs were differentially expressed (fold change ≥1.5 and P < 0.05) consistently with the microarray samples.

Using the leave-one-out cross-validation of six machine-learning algorithms in 46 HCC samples of Sh-mic and Sh-seq, we assessed the potential of predicting the risk of metastasis for each differentially expressed gene. Among 51 differential genes, 37 genes including 5 lncRNAs showed more than 70% average classification accuracy, and at least 60% classification accuracy for each algorithm. Moreover, these 37 genes could efficiently separate HRM from LRM, except for 5 misclassified samples (Supplementary Fig. S1A). Besides, lncRNAs or mRNAs could independently divide HRM and LRM (Supplementary Fig. S1B and S1C). These results imply that the expression of both mRNAs and lncRNAs contributes to the classification of LRM and HRM. From 37 potential genes, we further refined seven candidate genes whose combinational expression showed more than 90% average classification accuracy (Supplementary Table S3). These seven candidate genes could efficiently divide 23 HRM HCC patients from 23 LRM patients, of which only three samples were misclassified (Supplementary Fig. S2).

Using RT-PCR in another 33 LRM and 24 HRM HCC patients, we further confirmed the expression pattern of the seven candidates. In this batch, 5 LRM and 5 HRM patients were also measured by microarrays. We calculated the correlation of gene expression between microarray and RT-PCR to assess reproducibility. A strong positive correlation was observed for all candidates except SPRY1, which may be due to the limited number of patients (Supplementary Table S4). Moreover, all seven potential biomarkers were consistently downregulated in all three batches of patients.

Six-gene metastasis model and the prediction of clinical outcome

Next, we performed logistic regression to establish a metastasis-related model in 57 PCR-measured HCC patients for predicting the probability of tumor metastasis (Fig. 1). In the univariate analysis, we found that the expression of six out of seven candidate biomarkers showed significant association with the risk of metastasis (Supplementary Table S5). To identify a robust model, we explored a linear and weighted combination of the expression of potential biomarkers to divide metastatic HCC patients. Through an exhaustive search, we finally identified a model of six genes to calculate the MP. The model is as follows: $\mathrm{MP} = \frac{e^{\mathrm{MS}}}{1 + e^{\mathrm{MS}}}$, where $e^{\mathrm{MS}}$ is the natural exponential value of the metastasis score (MS), and MS is a weighted linear combination of the relative expression of the six genes, with coefficient magnitudes of 0.51 (AHCYL2), 0.54 (LAMP2), 0.36 (SPRY1), 0.33 (SERPINA7), 0.33 (FGGY), and 0.18 (ASLNC16648) and an intercept of +0.001.
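The step from metastasis score to probability is a standard logistic transform, which the following sketch illustrates. It takes MS as a given input rather than recomputing it from gene expression; the function names and the example MS values are hypothetical, and the 0.7 cutoff is the threshold reported in the text.

```python
import math

CUTOFF = 0.7  # MP threshold reported in the text

def metastasis_probability(ms):
    """Logistic transform MP = e^MS / (1 + e^MS).

    Written as 1 / (1 + e^-MS), which is algebraically identical.
    """
    return 1.0 / (1.0 + math.exp(-ms))

def risk_group(mp, cutoff=CUTOFF):
    """MP above the cutoff is called HRM, otherwise LRM."""
    return "HRM" if mp > cutoff else "LRM"

# The MS value at which MP equals the cutoff: MS = ln(0.7 / 0.3).
ms_at_cutoff = math.log(CUTOFF / (1 - CUTOFF))

for ms in (-1.0, 0.0, 2.0):  # hypothetical metastasis scores
    mp = metastasis_probability(ms)
    print(f"MS={ms:+.1f}  MP={mp:.3f}  group={risk_group(mp)}")
```

Because the transform is monotonic, applying the 0.7 cutoff to MP is equivalent to requiring MS above roughly 0.85, so a score of zero (MP = 0.5) still falls in the low-risk group.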

To assess the sensitivity and specificity of the metastasis model, ROC analysis was performed in the three batches of HCC samples from the Shanghai cohort. The results indicated that the metastasis model can efficiently separate HRM from LRM samples in Sh-mic, Sh-seq, and Sh-PCR, with AUCs (area under the curve) of 0.98 (95% CI, 0.93–1.0), 1.0 (95% CI, 1.0–1.0), and 0.85 (95% CI, 0.75–0.95), respectively (Fig. 2A). Using the leave-one-gene-out method, we observed that AUC values decreased after removing any gene in the model (Supplementary Table S6). This implied that all six genes are indispensable for efficiently predicting HCC metastasis. According to the ROC curves of the Shanghai samples, the optimal cutoff (α) for dividing HRM and LRM was 0.7, with 96% (95% CI, 85%–99%) specificity and 74% (95% CI, 58%–86%) sensitivity (Fig. 2A). In the EHBH-Sh samples, excluding the 10 replicates measured by RT-PCR, 60 HCC patients with MP ≤ 0.7 were predicted as LRM patients, and 33 patients with MP > 0.7 were predicted as HRM patients; overall, 14% (13 of 93) of samples were misclassified (Fig. 2B). These results suggested that MP could efficiently identify the risk of metastasis.

Figure 2.

Performance of the six-gene metastasis model in separating HRM from LRM samples. A, ROC curves for predicting tumor metastasis. The formula for calculating MS is shown at the top of the panel. The optimal cutoff α is marked using the tetragon. The number in the bracket of the legend indicates the AUC. B, The probability of metastasis predicted by the six-gene metastasis model. The horizontal dashed line represents the threshold dividing LRM and HRM HCC. Cyan and red points represent the clinically defined LRM and HRM patients, respectively. Three batches of HCC samples from EHBH-Sh were measured through microarray (Sh-mic), RNA-seq (Sh-seq), and RT-PCR (Sh-PCR), respectively.

Because HCC metastasis is strongly associated with poor survival of patients, we then analyzed whether the prediction derived from clinically well-defined HRM and LRM patients could be used to indicate HCC prognosis. Among 57 HCC patients in Sh-PCR, 17 patients were predicted to be HRM and showed significantly poorer OS than the remaining 40 predicted LRM patients (P < 0.001; Supplementary Fig. S3). This indicated that the metastasis model-derived probability could be used to predict the clinical behavior of HCC patients, including OS.

Validation of the six-gene metastasis model through HCC prognosis analysis

To validate the prognostic efficiency of the metastasis model, we used RT-PCR to measure the expression of the metastasis signature in two independent cohorts, which contained 77 and 79 HCC patients from the Fujian and Guangxi cohorts, respectively (Fig. 1). As the samples in the Fujian and Guangxi cohorts were randomly selected and not clinically well-defined with respect to metastasis, we assessed the metastasis model through prognosis analysis. Using the predefined 0.7 threshold, 17 and 14 HCC patients from Fujian and Guangxi, respectively, were assigned to the HRM group, and the remaining patients were assigned to the LRM group (Supplementary Fig. S4A and S4B). Survival analysis indicated that the OS and RFS of HRM patients were significantly shorter than those of LRM patients in both the Fujian and Guangxi cohorts (P < 0.05; Fig. 3A and B).

Figure 3.

Survival curves of HCC patients in multicenter cohorts. The division of LRM and HRM patients was based on 0.7 MP, which was predicted by six-gene metastasis model.

For further external validation of the metastasis model, we used another two independent cohorts from LCI-Sh and Korea, in which the relative expression of 228 and 181 HCC patients, respectively, had been profiled through microarrays. All six genes of the metastasis signature were measured in the Korean cohort, whereas the expression of all genes except ASLNC16648 was available in the LCI-Sh cohort. Using the same cutoff, 91 LCI-Sh patients and 39 Korean patients were classified as HRM HCC (Supplementary Fig. S4C and S4D). In both cohorts, HRM patients similarly showed poorer OS and RFS (Fig. 3C and D).

For the LCI-Sh cohort, Roessler and colleagues used 161 genes to predict the metastasis of HCC patients (12). On the basis of our six-gene signature, the average probability of HCC metastasis in the low-risk group defined by the 161-gene classifier was markedly lower than that in the high-risk group (P < 0.001; Supplementary Fig. S5). Moreover, our model, which uses far fewer genes, could significantly stratify HCC patients with poor and favorable survival.

As HCC metastasis often leads to early recurrence of tumors (recurrence within 2 years after surgery), we further analyzed whether our six-gene metastasis model could also predict the prognosis of HCC patients with early recurrence. The results indicated that HRM patients with early relapsed HCC showed significantly shorter RFS in all cohorts (P < 0.01; Fig. 4). Together, the analyses indicated that our six-gene metastasis model robustly predicts the clinical behavior of HCC patients, including survival and even early recurrence, in multicenter cohorts.

Figure 4.

RFS analysis of HCC patients with early recurrence. Early recurrence indicates HCC recurrence within 24 months after surgery. The probability of 0.7 was used to divide LRM and HRM risk of metastasis.

Subgroup and multivariate analysis of the metastasis model

Furthermore, we analyzed whether our six-gene metastasis model could predict prognosis in subgroups of HCC patients, including those with a solitary nodule and those with early-stage tumors, for example, stage A of the BCLC staging system. There were 58 (85%), 40 (64.5%), and 183 (80%) HCC patients with a single nodule in the Fujian, Guangxi, and LCI-Sh cohorts, respectively, whereas no nodule information was available for the Korean cohort. Among single-nodule patients, those classified as LRM had significantly more favorable OS and RFS in all three cohorts (Supplementary Fig. S6). Meanwhile, 51 (75%) HCC patients in Fujian, 36 (58%) patients in Guangxi, and 144 (63%) patients in the LCI-Sh cohort were in stage A of the BCLC staging system. Subgroup analysis of the HCC patients in BCLC stage A indicated that both OS and RFS were poorer for HRM patients in these three cohorts (P < 0.05; Fig. 5). These results suggest that MP has prognostic potential in subgroups of HCC patients, especially those with early tumors.

Figure 5.

Survival curves in a subgroup of HCC patients in stage A of BCLC. HCC patients for subgroup analysis of OS and RFS were stratified into LRM and HRM risk of metastasis using 0.7 probability as the cutoff.

After confirming that the assumptions of the Cox proportional hazards model were satisfied, we next investigated whether our six-gene metastasis signature is a survival-associated and independent risk factor. In the Fujian, Guangxi, and LCI-Sh cohorts, respectively, 46 (68%), 33 (53%), and 127 (56%) patients had tumor recurrence, and 24 (35%), 19 (31%), and 89 (39%) had died. Univariate Cox analysis indicated that continuous MP is a significant risk factor for poor OS and RFS in the multiple cohorts (Supplementary Table S7). In the univariate analysis, several clinical characteristics, including tumor size, multinodular tumors, and BCLC stage, were also associated with survival. However, clinical metastasis-related features such as tumor envelope were not significantly related to survival, suggesting that MP may be more valuable than some traditional criteria. Multivariate analysis confirmed that continuous MP was a strong independent risk factor for poor OS and RFS (Table 2). In the Fujian cohort, MP was a significant risk factor for poor OS (HR = 6.24; 95% CI, 1.22–32; P = 0.028) and RFS (HR = 3.09; 95% CI, 1.0–9.5; P = 0.0495). Independent of BCLC, MP was significantly associated with survival (HR = 16.6; 95% CI, 2.33–119; P = 0.0050 for OS; HR = 8.01; 95% CI, 1.78–36.1; P = 0.0067 for RFS) in the Guangxi cohort. In the LCI-Sh cohort, MP was significantly related to OS (HR = 3.03; 95% CI, 1.15–7.99; P = 0.025) and RFS (HR = 2.98; 95% CI, 1.36–6.55; P = 0.0064), independent of multinodularity, AFP, BCLC, and Cancer of the Liver Italian Program (CLIP). These results showed that MP is a significant risk factor independent of well-known clinical indexes, including the BCLC and CLIP staging systems.

Table 2.

Multivariate Cox analysis of the probability of HCC metastasis and clinicopathologic variables

Variable^a | OS: HR (95% CI) | P | RFS: HR (95% CI) | P
Cohort Fujian
Probability of metastasis | 6.24 (1.22–32) | 0.028 | 3.09 (1.0–9.5) | 0.0495
Tumor thrombus (yes vs. no) | 2.51 (1.03–6.13) | 0.044 | 1.46 (0.78–2.74) | 0.24
BCLC (0:A:B:C) | 0.53 (0.26–1.11) | 0.093 | 1.49 (0.93–2.39) | 0.098
Cohort Guangxi
Probability of metastasis | 16.6 (2.33–119) | 0.0050 | 8.01 (1.78–36.1) | 0.0067
BCLC (0:A:B:C) | 2.7 (1.34–5.44) | 0.0057 | 2.82 (1.3–6.11) | 0.0089
Cohort LCI-Sh
Probability of metastasis | 3.03 (1.15–7.99) | 0.025 | 2.98 (1.36–6.55) | 0.0064
Multinodular (yes vs. no) | 0.40 (0.21–0.75) | 0.0048 | 0.38 (0.21–0.70) | 0.0018
AFP (>300 μg/L vs. ≤300 μg/L) | 0.44 (0.20–0.96) | 0.038 | 0.57 (0.30–1.08) | 0.084
BCLC (0:A:B:C) | 1.58 (1.06–2.36) | 0.024 | 1.75 (1.21–2.53) | 0.003
CLIP (0:1:2:3:4:5) | 2.26 (1.44–3.56) | 0.00043 | 1.60 (1.07–2.40) | 0.022

(a) Probability of metastasis is a continuous value.
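The hazard ratios, confidence intervals, and P values reported for the probability of metastasis can be cross-checked against one another. The following is a minimal sketch, assuming the intervals are Wald-type 95% intervals that are symmetric on the log-hazard scale (an assumption; the text does not state how the intervals were computed). The function name `wald_from_hr_ci` is hypothetical, and the inputs are the rounded OS values as printed, so the recovered P values are approximate.

```python
import math


def wald_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-hazard coefficient, its standard error, the Wald z
    statistic, and the two-sided P value from a reported HR and 95% CI.

    Assumes a Wald interval: exp(beta +/- z_crit * SE).
    """
    beta = math.log(hr)
    # The CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# OS results for the probability of metastasis, as printed (rounded).
reported_os = {
    "Fujian": (6.24, 1.22, 32.0),
    "Guangxi": (16.6, 2.33, 119.0),
    "LCI-Sh": (3.03, 1.15, 7.99),
}

for cohort, (hr, low, high) in reported_os.items():
    beta, se, z, p = wald_from_hr_ci(hr, low, high)
    print(f"{cohort}: beta={beta:.2f}, SE={se:.2f}, z={z:.2f}, P={p:.4f}")
```

Under this assumption, the recovered P values agree with the reported ones to within rounding. The wide intervals in the smaller cohorts correspond to large standard errors on the log scale.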

Intrahepatic metastasis is a leading cause of early recurrence and poor prognosis in HCC patients, but the available signatures are limited and predict clinical behavior inconsistently. In this study, we used diverse genome-wide technologies to profile clinically well-defined, randomly selected HCC samples and to identify and validate a six-gene metastasis signature in an unbiased manner. Using a predefined and fixed cutoff, the model separated LRM and HRM patients with 96% specificity and 74% sensitivity, although these values may be overestimated because the metastasis model was established in the same samples. Moreover, the six-gene metastasis model not only stratified HCC patients by metastatic risk to predict OS and RFS, especially in early-stage tumors, but was also associated with early recurrence. This suggests considerable potential for the six-gene metastasis model in the early prognosis of HCC patients. Although multiple statistical tests were performed in the univariate Cox analysis, the multivariate analysis with adjusted HRs indicated that our six-gene metastasis signature is a significant survival-related risk factor independent of the well-known BCLC staging system. This implies that HCC prognosis could be improved by combining the six-gene metastasis signature with the existing staging system.
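To make the fixed-cutoff evaluation concrete, the sketch below computes sensitivity and specificity for a set of predicted metastasis probabilities at a cutoff of 0.7. It is illustrative only: the MP values and labels are hypothetical toy data, not patient data from the study, the function name `sensitivity_specificity` is invented for this example, and treating MP equal to the cutoff as HRM is an assumption.

```python
def sensitivity_specificity(mp_values, is_hrm, cutoff=0.7):
    """Classify each patient as HRM when MP >= cutoff (assumed convention)
    and compare against the known LRM/HRM label."""
    tp = fn = tn = fp = 0
    for mp, truth in zip(mp_values, is_hrm):
        predicted_hrm = mp >= cutoff
        if truth and predicted_hrm:
            tp += 1  # HRM correctly called HRM
        elif truth:
            fn += 1  # HRM missed
        elif predicted_hrm:
            fp += 1  # LRM wrongly called HRM
        else:
            tn += 1  # LRM correctly called LRM
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical toy data: four HRM patients followed by four LRM patients.
mp_values = [0.95, 0.82, 0.71, 0.40, 0.65, 0.10, 0.30, 0.75]
is_hrm = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(mp_values, is_hrm)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

When the same samples are used both to fit the model and to compute these two quantities, the estimates tend to be optimistic, which is why performance in independent cohorts is the more informative check.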

In the six-gene signature, LAMP2 is associated with tumor metastasis and early intrahepatic recurrence of HCC (23). The gene SPRY1 was reported to participate in cell adhesion and tumor migration (24). In prostate carcinoma, the expression of FGGY is significantly downregulated in patients with recurrence (25). The only lncRNA in our metastasis signature, ASLNC16648 (YBX1P4), is regarded as a pseudogene of YBX1 (Y box binding protein 1), a gene that can promote tumor metastasis (26). Intriguingly, YBX1 has been shown to upregulate the expression of LAMP2 and SPRY1 (27). This implies that the lncRNA ASLNC16648 may interact with LAMP2 and SPRY1 to regulate HCC metastasis.

In conclusion, this new metastasis signature might have several advantages: (i) it has potential clinical application across multiple platforms, including microarray, RNA sequencing, and RT-PCR; (ii) it contains a small number of genes, including a lncRNA, yet its prediction is highly consistent; and (iii) it is a significant survival-related risk factor independent of pathologic features in multiple cohorts.

No potential conflicts of interest were disclosed.

Conception and design: S. Yuan, J. Wang, J. Zhang, L. Liu, W. Zhou

Development of methodology: J. Wang, Y. Yang, J. Zhang

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): S. Yuan, J. Zhang, H. Liu, B. Xiang, L. Li

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): S. Yuan, J. Wang, Y. Yang, J. Zhang, J. Xiao, L. Li, W. Zhou

Writing, review, and/or revision of the manuscript: S. Yuan, J. Wang, J. Zhang, W. Zhou

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J. Zhang, Q. Xu, S. Zhu, L. Li

Study supervision: J. Zhang, J. Liu, L. Liu, W. Zhou

We thank Professor Fu Yang and Drs. Chuan-chuan Zhou, Xiaofei Ye, and Jin-zhao Ma for valuable discussion.

This work was funded by Chinese Key Project for Infectious Diseases (grant nos. 2012ZX10002010, 2012ZX10002016); National “863” project of China (2015AA020104); National “973” project of China (2014CB542102); Science Fund for Creative Research Groups, NSFC, China (grant no. 81221061); National Natural Science Foundation of China (grant nos. 81372207, 81572791, 81502375).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. CA Cancer J Clin 2011;61:69–90.
2. Bruix J, Sherman M, Practice Guidelines Committee, American Association for the Study of Liver Diseases. Management of hepatocellular carcinoma. Hepatology 2005;42:1208–36.
3. Forner A, Llovet JM, Bruix J. Hepatocellular carcinoma. Lancet 2012;379:1245–55.
4. Imamura H, Matsuyama Y, Tanaka E, Ohkubo T, Hasegawa K, Miyagawa S, et al. Risk factors contributing to early and late phase intrahepatic recurrence of hepatocellular carcinoma after hepatectomy. J Hepatol 2003;38:200–7.
5. Ye Q-H, Qin L-X, Forgues M, He P, Kim JW, Peng AC, et al. Predicting hepatitis B virus–positive metastatic hepatocellular carcinomas using gene expression profiling and supervised machine learning. Nat Med 2003;9:416–23.
6. Nault JC, De Reynies A, Villanueva A, Calderaro J, Rebouissou S, Couchy G, et al. A hepatocellular carcinoma 5-gene score associated with survival of patients after liver resection. Gastroenterology 2013;145:176–87.
7. Yuan SX, Wang J, Yang F, Tao QF, Zhang J, Wang LL, et al. Long noncoding RNA DANCR increases stemness features of hepatocellular carcinoma by derepression of CTNNB1. Hepatology 2016;63:499–511.
8. Yuan SX, Yang F, Yang Y, Tao QF, Zhang J, Huang G, et al. Long noncoding RNA associated with microvascular invasion in hepatocellular carcinoma promotes angiogenesis and serves as a predictor for hepatocellular carcinoma patients' poor recurrence-free survival after hepatectomy. Hepatology 2012;56:2231–41.
9. McShane LM, Altman DG, Sauerbrei W, Taube SE, Gion M, Clark GM, et al. REporting recommendations for tumor MARKer prognostic studies (REMARK). Nat Clin Pract Oncol 2005;2:416–22.
10. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med 2015;162:55–63.
11. Lim HY, Sohn I, Deng S, Lee J, Jung SH, Mao M, et al. Prediction of disease-free survival in hepatocellular carcinoma by gene expression profiling. Ann Surg Oncol 2013;20:3747–53.
12. Roessler S, Jia HL, Budhu A, Forgues M, Ye QH, Lee JS, et al. A unique metastasis gene signature enables prediction of tumor relapse in early-stage hepatocellular carcinoma patients. Cancer Res 2010;70:10202–12.
13. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 2012;7:562–78.
14. Kim BY, Suh KS, Lee JG, Woo SR, Park IC, Park SH, et al. Integrated analysis of prognostic gene expression profiles from hepatitis B virus-positive hepatocellular carcinoma and adjacent liver tissue. Ann Surg Oncol 2012;19 Suppl 3:S328–38.
15. Budhu A, Forgues M, Ye QH, Jia HL, He P, Zanetti KA, et al. Prediction of venous metastases, recurrence, and prognosis in hepatocellular carcinoma based on a unique immune response signature of the liver microenvironment. Cancer Cell 2006;10:99–111.
16. Smyth GK. Limma: linear models for microarray data. Bioinformatics and computational biology solutions using R and Bioconductor. New York: Springer; 2005. p. 397–420.
17. Simon R, Lam A, Li M-C, Ngan M, Menenzes S, Zhao Y. Analysis of gene expression data using BRB-array tools. Cancer Inform 2007;3:11.
18. McCullagh P, Nelder JA. Generalized linear models. London: Chapman and Hall; 1989. p. 1–526.
19. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 2011;12:1.
20. Su Y-S, Yajima M, Gelman AE, Hill J. Multiple imputation with diagnostics (mi) in R: opening windows into the black box. J Stat Soft 2011;45:1–31.
21. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457–81.
22. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994;81:515–26.
23. Uchimura S, Iizuka N, Tamesa T, Miyamoto T, Hamamoto Y, Oka M. Resampling based on geographic patterns of hepatitis virus infection reveals a common gene signature for early intrahepatic recurrence of hepatocellular carcinoma. Anticancer Res 2007;27:3323–30.
24. Masoumi-Moghaddam S, Amini A, Ehteda A, Wei AQ, Morris DL. The expression of the Sprouty 1 protein inversely correlates with growth, proliferation, migration and invasion of ovarian cancer cells. J Ovarian Res 2014;7:61.
25. Fritzsche S, Kenzelmann M, Hoffmann MJ, Muller M, Engers R, Grone HJ, et al. Concomitant down-regulation of SPRY1 and SPRY2 in prostate carcinoma. Endocr Relat Cancer 2006;13:839–49.
26. El-Naggar AM, Veinotte CJ, Cheng H, Grunewald TG, Negri GL, Somasekharan SP, et al. Translational activation of HIF1alpha by YB-1 promotes sarcoma metastasis. Cancer Cell 2015;27:682–97.
27. Basaki Y, Hosoi F, Oda Y, Fotovati A, Maruyama Y, Oie S, et al. Akt-dependent nuclear localization of Y-box-binding protein 1 in acquisition of malignant characteristics by human ovarian cancer cells. Oncogene 2007;26:2736–46.