Abstract
Epithelial–mesenchymal transition (EMT) is thought to be an important mechanism of cancer cell metastasis. Clinical measurement of EMT markers in primary tumors could improve risk stratification and treatment decisions by identifying patients who potentially have metastatic disease. To evaluate the potential of EMT markers that could be used for risk stratification for patients with colorectal cancer, we conducted a systematic review of studies (N = 30) that measured at least one of a selection of EMT markers in primary tumors and patient outcomes. Fifteen of 30 studies (50%) reported at least one statistically significant result supporting a role for one of the selected EMT markers in identifying patients at risk for worse outcomes. Importantly, however, we identified design inconsistencies that limited inferences and prevented meta-analysis of data. We offer a number of recommendations to make future studies more informative and standardized, including consistent sampling of different parts of the primary tumor, larger sample sizes, and measurement of both protein and RNA expression of a given EMT marker in the same tumors. Strengthening the literature per our recommendations could facilitate translating EMT markers to clinical use. Cancer Epidemiol Biomarkers Prev; 23(7); 1164–75. ©2014 AACR.
Introduction
Roughly 80% of cancer is epithelial in origin and approximately 90% of cancer deaths are due to metastases (1). Clinical tools that are grounded in epithelial biology and can refine classification of tumor metastasis status have the potential to substantially improve cancer patient outcomes.
Epithelial–mesenchymal transition (EMT) is thought to be a critical mechanism of metastasis (2–4). It consists of epithelial cells temporarily shedding their epithelial characteristics and acquiring the characteristics of mesenchymal cells. This involves upregulation of molecules that induce EMT and mesenchymal markers, as well as downregulation of epithelial markers. These expression changes lead to phenotypic changes such as reducing adhesion between the transitioning cell and adjacent epithelial cells, secretion of enzymes that degrade the extracellular matrix, and cytoskeletal alterations that increase cellular motility.
Metastases consist of individual cancer cells that break off from the primary tumor and enter the circulation or lymphatic system (1). Though the vast majority of metastases are destroyed before embedding at a distant site in the body, those that survive require time to develop from single cells into detectable tumors (5). Consequently, patients with cancer could have undetected micrometastases at the time of primary tumor surgery. Measurement of EMT marker expression levels in cancer cells from a resected primary tumor could indicate whether the tumor has been producing cells capable of acting as metastases, even if metastases are not detected.
Epithelial cells have been shown to undergo EMT within 48 hours of beginning exposure to an EMT inducer (6, 7). This provides a window of time during which the cancer cell remains attached to the tumor, but with altered expression levels of epithelial and mesenchymal markers. Measuring markers of metastasis directly in the primary tumor is an appealing way to determine whether the tumor is potentially metastatic.
Information on EMT marker expression levels in the primary tumor could help clinicians to determine the best treatment approaches for their patients. Specifically, EMT markers could improve risk stratification and guide decisions about adjuvant chemotherapy. For example, suppose a primary tumor is discovered that seems to be stage II. Currently, surgery alone would be considered sufficient treatment (8). However, if EMT marker measurements taken in the tumor suggest that it could be producing metastases, there might be a role for systemic chemotherapy in an attempt to destroy micrometastases too small to be detected via imaging.
To explore the potential of translating EMT markers to clinical cancer care, we posed the following questions: What criteria should define a clinically useful EMT marker? What is the most promising EMT marker for oncologic clinical purposes? To clarify these issues, we conducted a systematic review of the literature on measuring the relationship between EMT marker expression levels in resected primary tumor specimens and patient outcomes. Because clinically useful EMT markers for a given tumor site may not be clinically useful for other tumor sites (3, 9, 10), we confined ourselves to a single site: colorectal cancer. Our review could serve as a model for analogous reviews in other tumor sites.
Materials and Methods
Literature search
We sought to identify original journal articles that measured EMT marker expression levels in human clinical colorectal cancer primary tumor specimens and related those measurements to patient outcomes. The search strategy consisted of four groups of terms: EMT, tumor markers, outcomes, and colorectal cancer (see Supplementary Data). To be retrieved, an item had to have at least one term from each of the four groups. The tumor marker group included both broad and specific terms to retrieve papers on any related markers while ensuring that no studies of markers prominently discussed in the EMT literature (2, 4) were missed. We searched PubMed, EMBASE, and BIOSIS through May 2, 2013, using the same search terms for all databases. No limits on publication date, language, or publication type were applied to the search.
Search results of 745 abstracts (Fig. 1) were downloaded to an EndNote database. Two hundred and sixty-seven duplicates were removed, yielding 478 unique abstracts. To determine which abstracts examined clinical colorectal cancer specimens and therefore warranted full-text inspection, we designated a primary abstract reviewer (E.L. Busch) to read the 478 abstracts and create a spreadsheet recording the following for each: (i) which markers the study examined, (ii) whether it looked at cell lines, animals, and/or clinical tissue specimens, (iii) whether the paper was a review, and (iv) whether the abstract was for a meeting presentation. Meeting presentations and abstracts for papers that did not examine clinical colorectal cancer tissue specimens were excluded.
As a check of the primary reviewer's accuracy per established guidelines (11), we designated a secondary abstract reviewer (R.S. Sandler) to independently read a sample of 50 unique abstracts and determine which looked at clinical colorectal cancer specimens. Subsequent joint review of the results led to agreement that the primary reviewer's initial classifications were correct for 47 of the 50 dual-reviewed abstracts (94%). Of the three misclassified abstracts, two were cases in which the primary reviewer opted to inspect the full texts of abstracts that did not warrant it. Because only the remaining one (1 of 50) was an abstract that might have merited full-text inspection but did not receive it, we concluded that about 2% of unique abstracts fell into this category, though most of these items likely would not have qualified for inclusion in the final set of papers (Fig. 1).
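The dual-review percentages can be reproduced with a short sketch. The counts are those reported above; the variable names are illustrative, and the code assumes that the third misclassified abstract was one that merited inspection but was passed over, which is what the 2% figure implies.

```python
# Dual-review check, using the counts reported in the text.
dual_reviewed = 50       # abstracts read by both reviewers
correct = 47             # primary reviewer's classification upheld
over_inclusive = 2       # full text inspected although not warranted

misclassified = dual_reviewed - correct
# Assumption: the remaining misclassified abstract was wrongly excluded.
missed = misclassified - over_inclusive

agreement_pct = 100 * correct / dual_reviewed
missed_pct = 100 * missed / dual_reviewed
print(f"agreement: {agreement_pct:.0f}%, possibly missed: {missed_pct:.0f}%")
```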
On the basis of the 478 abstracts, 105 seemed to have used clinical colorectal cancer specimens and therefore warranted examination of the full text. We designated a primary full-text reviewer (E.L. Busch) to read all papers. Inspection revealed that one was a case report and one was not in English. These two were excluded, leaving 103 original journal articles that examined EMT markers in colorectal cancer tissue specimens. The primary reviewer then created a second spreadsheet documenting the following information for each of the 103 papers: (i) sample size, (ii) whether Kaplan–Meier survival curves were presented, (iii) whether effect estimates (e.g., HRs, ORs) were calculated, (iv) whether correlations between EMT marker levels and other measurements were presented, (v) whether the percentage of cases found to have positive expression of EMT markers was provided, and (vi) whether any measures of reliability were reported. Measurements of patient outcomes were defined as regression estimates of effect or Kaplan–Meier analyses.
We designated a secondary full-text reviewer (K.A. McGraw) to independently read all 103 papers and determine whether each measured (i) EMT markers in colorectal cancer primary tumors and (ii) the relationship of markers with outcomes. Subsequent comparison of the results led to agreement that the primary reviewer had correctly assessed all 103 papers.
We found that 37 papers measured at least one EMT marker in clinical colorectal cancer primary tumors and evaluated the relationship between EMT marker expression and patient outcomes. Between them, these papers measured dozens of markers or categories of markers (“categories of markers” meaning, e.g., measurement of multiple miRNAs counts as simply “miRNA”). Because evaluating such a large number of markers was beyond the scope of a single paper, we restricted our discussion to the 14 markers and marker categories specified in our search terms: E-cadherin, N-cadherin, vimentin, Snail, Slug, cytokeratins, integrins, fibronectin, Twist, ZEB1, ZEB2, β-catenin, TGFβ, and miRNAs. We excluded seven papers that measured a variety of markers but not one of the selected markers (12–18), leaving a final set of 30 papers that were evaluated for the markers of interest, though many of them measured other markers as well.
For each of the 30 papers, we recorded the following additional information for each EMT marker that the paper measured from our selection: whether marker expression was measured as protein and/or RNA, how the study defined positive expression for the marker, and whether the paper presented Kaplan–Meier analyses stratified by expression levels of that particular marker.
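The selection steps described in this section can be tallied in a few lines as a consistency check. The counts are those reported in the text; the step labels and variable names are illustrative.

```python
# Selection flow using the counts reported in the text.
# Step labels and variable names are illustrative, not taken from the source.
flow = [
    ("abstracts retrieved", 745),
    ("unique abstracts after removing 267 duplicates", 745 - 267),
    ("abstracts warranting full-text inspection", 105),
    ("articles left after excluding 1 case report and 1 non-English study", 105 - 2),
    ("papers relating EMT markers to patient outcomes", 37),
    ("papers measuring at least one selected marker", 37 - 7),
]

for label, count in flow:
    print(f"{count:4d}  {label}")

final_count = flow[-1][1]
```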
Data extracted from papers
Definition of positive/negative expression
We considered expression of a marker above the threshold stated in a paper to be positive and expression below that threshold to be negative.
Positive and negative expression refer to the degree to which a marker is expressed, and not to whether the marker's expression is indicative of changes in cellular phenotype. For example, a cancer cell expressing high levels of the epithelial marker E-cadherin and the mesenchymal marker vimentin would be considered positive for both. However, the significance of positive expression would differ between the markers. Vimentin-positive status suggests that the cell is undergoing EMT, whereas E-cadherin–positive status suggests the opposite.
Definitions of marker-expression status varied considerably across studies, even when measuring the same marker using the same technique. For example, consider the two studies that measured N-cadherin using immunohistochemistry (IHC). Kouso and colleagues defined marker-positive status as positive staining in at least 10% of tumor cells (19). Fan and colleagues assigned a score on a scale of 0–8 based on staining intensity and extent, then defined positive expression as a score of at least 4.5 (20).
Our decision not to combine subjects from different studies in a meta-analysis was based on differences in definitions of positive expression across studies. A tumor classified as marker positive under one definition might be classified as marker negative under another definition (21, 22).
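A minimal sketch of how one tumor can change status between the two N-cadherin definitions described above. The `Tumor` fields and the example values are hypothetical; in particular, how the 0–8 score of ref. 20 combines intensity and extent is not specified here, so the score is treated as a given input.

```python
from dataclasses import dataclass


@dataclass
class Tumor:
    pct_cells_stained: float  # percentage of tumor cells with positive staining
    composite_score: float    # hypothetical 0-8 score combining intensity and extent


def positive_by_fraction(t: Tumor, cutoff_pct: float = 10.0) -> bool:
    """Positive when at least cutoff_pct percent of tumor cells stain (rule of ref. 19)."""
    return t.pct_cells_stained >= cutoff_pct


def positive_by_score(t: Tumor, cutoff: float = 4.5) -> bool:
    """Positive when the 0-8 composite score is at least cutoff (rule of ref. 20)."""
    return t.composite_score >= cutoff


# Hypothetical tumor: 15% of cells stain, but weakly, so the composite score is low.
tumor = Tumor(pct_cells_stained=15.0, composite_score=3.0)
print(positive_by_fraction(tumor), positive_by_score(tumor))
```

Pooling subjects across two such studies would therefore mix tumors whose labels are not interchangeable.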
Percent-positive expression
We defined percent positive expression as the percentage of subjects whose tumor specimens positively expressed a given marker.
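As a worked illustration of this definition, the following sketch computes percent-positive expression from per-subject marker status. The function name and the cohort are hypothetical.

```python
def percent_positive(statuses):
    """Percentage of subjects whose tumor specimen was classified marker positive."""
    if not statuses:
        raise ValueError("no subjects")
    return 100 * sum(statuses) / len(statuses)


# Hypothetical cohort of 10 subjects, 3 of them marker positive.
cohort = [True, False, False, True, False, False, False, True, False, False]
print(percent_positive(cohort))
```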
Survival
We looked for Kaplan–Meier analyses that stratified subjects by EMT marker expression status, that is, presented separate survival curves for subjects whose tumors exhibit different levels of marker expression. Such stratified analyses can suggest whether the expression level of an EMT marker is related to patient outcomes.
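A minimal sketch of a stratified Kaplan–Meier analysis, written in plain Python to show the product-limit calculation performed separately for each marker stratum. The follow-up data are hypothetical and the function name is illustrative; a real analysis would normally use a statistics package and a formal test for the difference between curves.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time for each subject
    events -- True if death was observed at that time, False if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(t for t, e in zip(times, events) if e)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up (months), one curve per marker stratum.
positive = kaplan_meier([6, 12, 12, 20, 30], [True, True, False, True, False])
negative = kaplan_meier([10, 24, 36, 40, 48], [False, True, False, False, False])

for name, curve in (("marker positive", positive), ("marker negative", negative)):
    print(name, [(t, round(s, 2)) for t, s in curve])
```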
Measures of effect
Some studies performed regression modeling in which expression of EMT markers was included as an independent variable with patient outcomes as the dependent variable. Such measures of effect were most often HRs from Cox proportional hazards models of patient time-to-mortality, but could also be ORs from logistic regression models. These estimates directly evaluate associations between EMT marker expression in primary tumor cancer cells and patient outcomes.
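To make the idea of an effect estimate concrete, the sketch below computes an unadjusted OR with a Wald 95% confidence interval from a 2×2 table, which matches what a logistic regression with marker status as its only predictor would give. The counts are hypothetical, and the reviewed studies generally fitted adjusted models, so this is a simplification.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: marker positive, outcome present    b: marker positive, outcome absent
    c: marker negative, outcome present    d: marker negative, outcome absent
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# Hypothetical counts for 100 subjects.
or_, lower, upper = odds_ratio(20, 30, 10, 40)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```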
Relevant correlations
We noted correlations between measurements of EMT marker expression levels and other marker expression levels, tumor characteristics, and other quantities of interest. These relationships can suggest useful information such as whether one EMT marker can act as a surrogate for another, or whether some tumor characteristic such as tumor budding is correlated with marker expression levels indicative of metastasis.
Reliability
Reliability indicates the degree of consistency or repeatability when measuring a marker. We considered two kinds of reliability for studies that measured marker protein expression using IHC, which often involves manual scoring. First was inter-rater reliability, the consistency of measurements when different people score the same specimens. Second was intra-rater reliability, the consistency of measurements when the same person scores a specimen multiple times.
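One common way to quantify agreement of this kind is Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The sketch below is an assumption about how reliability might be summarized, not a method prescribed by the reviewed studies; the ratings are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two sets of categorical scores on the same specimens."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical IHC calls on 10 specimens by two raters.
a = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "pos", "neg", "neg"]
b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "neg"]
kappa = cohens_kappa(a, b)
print(round(kappa, 3))
```

The same function applies to intra-rater reliability if the two lists hold one person's first and second scoring passes.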
Results
Because most studies measured markers only as protein, results refer to protein measurements unless noted otherwise. Results across studies are summarized for protein in Table 1 and for RNA in Table 2. We distinguish between markers of epithelial phenotype (epithelial markers), markers of mesenchymal phenotype (mesenchymal markers), and markers whose elevated expression can induce EMT (EMT inducers). EMT is suggested by low levels of epithelial markers and by high levels of EMT inducers and mesenchymal markers.
Marker | Percent positive (b): # studies | Percent positive (b): range | Survival: # studies | Survival: # with difference by marker status | Effect estimates: # studies | Effect estimates: # with effect on outcomes |
---|---|---|---|---|---|---|
β-catenin | 5 | 42%–90% | 2 | 0 | 1 | 0 |
Cytokeratins | 2 | 9%–85% | 2 | 1 | 1 | 0 |
E-cadherin | 15 | 29%–92% | 4 | 3 | 3 | 1 |
Fibronectin | 0 | — | 0 | 0 | 0 | 0 |
Integrins | 2 | 19%–88% | 1 | 1 | 1 | 1 |
N-cadherin | 2 | 0%–44% | 0 | 0 | 1 | 0 |
Slug | 2 | 30%–37% | 2 | 1 | 1 | 1 |
Snail | 5 | 40%–79% | 3 | 2 | 2 | Mixed |
TGFβ | 2 | 82%–89% | 1 | 0 | 0 | 0 |
Twist | 4 | 48%–100% | 1 | 1 | 3 | Mixed |
Vimentin | 4 | 0%–49% | 0 | 0 | 0 | 0 |
ZEB1 | 1 | 29% | 1 | 1 | 0 | 0 |
ZEB2 | 2 | 48%–90% | 1 | 1 | 1 | Mixed |
NOTE: Mixed, at least one study of a given marker reported both statistically significant and nonsignificant estimates of the effect of marker expression levels on outcomes.
(a) The number of studies under percentage of tumors with positive marker expression, survival, or effect estimates refers to the number of studies that reported values for that measurement for a given marker.
(b) Definitions of positive marker expression in a tumor generally varied across multiple studies of a given marker.
Marker | Percent positive (b): # studies | Percent positive (b): range | Survival: # studies | Survival: # with difference by marker status | Effect estimates: # studies | Effect estimates: # with effect on outcomes |
---|---|---|---|---|---|---|
miRNAs | 2 | 48%–76% | 2 | 2 | 0 | 0 |
Twist | 1 | 86% | 1 | 1 | 1 | 1 |
(a) The number of studies under percentage of tumors with positive marker expression, survival, or effect estimates refers to the number of studies that reported values for that measurement for a given marker.
(b) Definitions of positive marker expression in a tumor varied across multiple studies of a given marker.
β-catenin
We found five studies that measured the mesenchymal marker β-catenin in colorectal cancer tissue (19, 20, 23–25). The two that looked at survival by β-catenin status found no difference in outcomes between marker-positive and marker-negative subjects (23, 24). The only study that looked at effect estimates did not find any effect of β-catenin measurements on outcomes (20). Percentage of subjects with positive expression varied particularly by location in the cell (nuclear, cytoplasmic, membranous), but was generally in the 40% to 50% range. Correlations between β-catenin levels and location in tumor mass were inconsistent.
Cytokeratins
Of five studies that stained for cytokeratins, three used them only as a background stain (26–28). Of the other two studies, one found that cytokeratin-8–positive subjects had better survival than cytokeratin-8–negative subjects, whereas cytokeratin-14–negative cases had better survival than cytokeratin-14–positive cases (29). In that study, the percentage of subjects with positive expression was high for cytokeratin-8 (85%) and moderate for cytokeratin-14 (59%). The last study found positive cytokeratin-7 expression in 9% of tumors and no effect on outcomes or difference in survival based on cytokeratin-7 expression (30).
E-cadherin
We found 20 papers that measured E-cadherin in clinical colorectal cancer tissue. Of these, four that measured protein (23, 30–32) and one that measured RNA (33) failed to provide information on the number of tumors considered marker positive. Among the other 15 studies, a wide range of definitions of marker-positive status were used (17, 19, 20, 25, 34–44). Percentage of subjects with positive expression varied but mainly fell between 30% and 70%. Four studies looked at survival by E-cadherin status, with three finding poorer survival associated with reduced E-cadherin expression (34, 37, 42) and one finding no difference by E-cadherin status (39). Three studies looked at effect estimates, with two finding no effect of E-cadherin levels on outcomes (20, 36) and one finding that E-cadherin levels do affect patient outcomes (42).
Fibronectin
No paper measured fibronectin in clinical colorectal cancer specimens and related fibronectin levels to patient outcomes.
Integrins
Two papers looked at members of the integrin family of proteins, which are mesenchymal markers in the context of EMT. One study found 19% of subjects with positive expression for integrin α-5-β-1 and 88% of subjects with positive expression for integrin α-3-β-1 but did not look at the association of either protein with outcomes (43). The other, much larger study found 37% of subjects with positive expression for integrin α-v-β-6 and clear differences in survival and effect estimates for the protein's relationship with outcomes (45).
miRNAs
Three papers measured miRNAs (miR), but one (33) did not provide information on how many tumors were considered marker positive. Of the other two, one reported that subjects with high expression of miR-19b and miR-194 had shorter survival than those with low expression (46). The other found shorter survival for subjects whose tumors had low expression of miR-212 compared with those with high expression (47).
N-cadherin
Two studies measured the mesenchymal marker N-cadherin in clinical colorectal cancer specimens. One, which included 10 subjects, did not find positive N-cadherin expression in any tumor and did not look at the relationship of N-cadherin with outcomes (19). The other study found 44% of subjects with positive expression and, while it did not look at survival by N-cadherin status, calculated effect estimates and found no effect of N-cadherin on outcomes (20). However, the latter study's 193 subjects were divided into training and testing sets before effect estimates were calculated, thus reducing the power for each estimate.
Slug
Three studies measured the EMT inducer Slug, though one of them (48) that measured RNA failed to provide information on the number of tumors considered marker positive. Of the other two studies, one with a sample size of 10 patients found 30% of subjects with positive expression in primary tumors, and observed no difference in survival by Slug status (19). The other study found 37% of subjects with positive expression (42). Slug-positive patients in this last study had poorer survival than Slug-negative patients, and using HRs, the authors concluded that Slug was an independent prognostic factor of outcomes.
Snail
Five studies measured the EMT inducer Snail. Three found 40% to 55% of subjects with positive expression (19, 20, 36) and two in the range of 75% to 80% (39, 49), though the studies used a variety of definitions of Snail-positive status. Of the three studies that looked at survival, two found worse survival in Snail-positive subjects than Snail-negative subjects (19, 49), and the other study found no difference by Snail expression status (39). The two studies that did not look at survival did look at effect estimates, and each obtained mixed within-study results (20, 36).
TGFβ
Two studies measured members of the TGFβ class of EMT inducers. One study of TGFβ-R2 found almost 90% of subjects with positive expression and no difference in survival by TGFβ status (24). The other study measured TGFβ-1 and, while finding a wide distribution of expression across subjects, did not relate these measurements to outcomes (26).
Twist
Five studies looked at the EMT-inducing Twist family in clinical colorectal cancer specimens. One study (48) that measured mRNA of Twist1 found that 86% of subjects showed positive expression and that those with positive expression had worse survival than those with negative expression. Survival differences were especially large among early-stage subjects. The study's effect estimates also indicated an effect of Twist levels on outcomes. The other four studies measured protein. One with 10 subjects found 100% of subjects with positive expression but did not look at outcomes (19). Another two studies found roughly 50% of subjects with positive expression and did not look at survival (20, 36). Both calculated effect estimates with mixed results. The fifth study measured Twist2 and found that Twist2-positive subjects had worse survival than Twist2-negative subjects (44). It also obtained statistically significant effect estimates of the impact of Twist2 expression on outcomes.
Vimentin
Of the five studies that measured this mesenchymal marker, one (33) measured mRNA and did not provide information on how many specimens were considered marker positive. The other four studies measured protein. Of these, two (19, 30) found 0% of subjects with positive expression—despite using different definitions of marker-positive status—and another (40) found 9% of subjects with positive expression. The fourth study reported 49% of tumors as vimentin positive (41), but in this case, the investigators defined marker-positive status as anything above the median staining value, making half the tumors positive by definition. None of these four studies looked at survival or effect estimates for vimentin expression.
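The point about median-based cutoffs can be illustrated with a short sketch: when positive status is defined as a value above the sample median, about half of any cohort is positive regardless of how strongly its tumors actually stain. The staining values and function name are hypothetical.

```python
from statistics import median


def median_split(values):
    """Label each value positive if it is strictly above the sample median."""
    cutoff = median(values)
    return [v > cutoff for v in values]


# Two hypothetical cohorts with very different staining levels.
low_staining = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
high_staining = [10, 20, 30, 40, 50, 60, 70, 80]

for cohort in (low_staining, high_staining):
    labels = median_split(cohort)
    print(100 * sum(labels) / len(labels))
```

With tied values or an odd number of subjects the share can fall slightly below 50%, which is consistent with a reported figure such as 49%.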
ZEB1
In the three studies that measured the EMT inducer ZEB1, two (27, 32) did not report the percentage of subjects with positive expression. The other found 29% of subjects with positive expression and that ZEB1-positive patients had much shorter average survival than ZEB1-negative patients (31 months vs. 67 months, respectively; ref. 40).
ZEB2
Two studies measured the EMT inducer ZEB2. One with 10 subjects found 90% of subjects with positive expression and did not look at survival or effect estimates (19). The other study found that 48% of the tumors were ZEB2 positive at the tumor invasion front and 41% were ZEB2 positive at the tumor center (50). This study reported that 73% of the primary tumors had greater ZEB2 expression at the invasion front compared with the tumor center. ZEB2-positive patients had poorer survival than ZEB2-negative patients. In terms of effect estimates, ZEB2 levels at the invasion front were a predictor of outcomes, but ZEB2 expression at the tumor center was not.
Discussion
EMT markers are possibly important clinical tools to aid oncologists in more accurately assessing whether a tumor is potentially metastatic. For example, current guidelines do not recommend systemic chemotherapy for stage II colorectal cancer (8). Although the overall survival for those diagnosed with local or regional disease is high (51), some of these patients do develop metastatic disease and die. If clinicians could identify those with metastatic potential based on EMT markers, such patients might be candidates for chemotherapy. Those without metastatic potential based on EMT markers would be spared chemotherapy. Practically, fulfillment of this promise requires not only identifying the best marker or combination of markers of EMT, but also determining the most clinically useful definition of marker expression levels.
To assess the possibility of translating EMT markers to colorectal cancer care, we examined studies relating EMT marker expression in primary colorectal tumors with patient outcomes. Overall, the literature examined a wide variety of EMT markers. This makes sense given that EMT consists of a global change in cellular phenotype in which expression levels of numerous proteins and RNA transcripts change. The cascade of cellular alterations means that there are many candidate markers of the transition for potential clinical use. The quality of the literature is critical to assessing the possibility of translating any particular EMT marker or combination of markers to the clinic.
Quality and limitations of studies
Collectively, the papers examined in this review provide encouragement that EMT markers may be translated to clinical colorectal cancer care. We found that 15 of 30 papers (50%) reported at least one statistically significant result (an effect estimate or a difference between marker expression-stratified survival curves) that supported a role for EMT markers measured in primary colorectal cancer tumors in predicting patient outcomes (19, 20, 29, 34, 36, 37, 40, 42, 44–50). This is impressive given that the 30 studies had relatively small sample sizes (mean, 213.3 subjects; range, 10–566, with 80% having fewer than 260 subjects). Although publication bias may have skewed the proportion with a significant finding upward, the proportion accounts only for our selection of markers. Some of the 15 papers that did not report any significant results for our selection of markers reported significant findings for other markers such as CD44v6 (41) and RKIP (31).
Each of the 30 studies used a hospital-based sample rather than a population-based sample. The advantages of hospital-based samples in studies of EMT markers and patient outcomes are convenience, less variation in tumor specimen preservation and storage, and greater control over which portions of the tumor are sampled. For example, previous authors have noted that the invasive front of a tumor may be an especially rich source of metastases because many cancer cells there undergo EMT as part of invasion (3, 52). Therefore, researchers may be especially interested in sampling the invasive front to measure expression levels of their markers of interest. In a population-based sample drawing on specimens from many different clinics, it may be more difficult to ensure that specific parts of the tumor are sampled, especially if studying EMT was not the primary research objective in collecting the population-based sample. The advantage of a population-based sample is greater external validity of inferences (53).
The results reveal several weaknesses in the literature that make it difficult to determine which EMT markers, and which definitions of the expression status of those markers, are most strongly associated with colorectal cancer patient outcomes. Understanding these weaknesses is critical to designing more informative studies in the future.
First, there is no consistent definition of positive marker expression across studies of the same marker using the same technique. This inconsistency hampers comparisons and prevents valid meta-analytic combination of data across studies. This last point is important because of the small sample sizes noted earlier. A further issue related to defining positive marker expression is that some studies use scoring techniques that may not be optimally reproducible, such as manual immunohistochemical scoring rather than computer-assisted image analysis.
Second, 28 of 30 articles (93%) provided Kaplan–Meier survival analyses, with 25 of 30 (83%) presenting survival curves stratified by marker expression status for at least one EMT marker (i.e., separate curves for marker-positive and marker-negative subjects, or for high, medium, and low expression). Among the papers that did stratify for at least one marker, the authors did not always present stratified survival curves for every EMT marker studied. Survival curves cannot suggest whether marker expression status impacts survival unless they are stratified by marker expression status.
Of 30 studies, 22 (73%) provided estimates of measures of effect, typically using Cox time-to-event analyses. Such estimates are critical for any future meta-analyses and should always be performed for every EMT marker studied. Among studies that estimated measures of effect, the independent variables in the models typically included EMT marker expression and a set of covariates. Studies varied considerably in terms of which covariates were included in the models (Table 3), as well as how EMT marker expression and other variables were coded. These modeling considerations should be consistent across studies to make results comparable and to increase the validity of any future meta-analytic combination of estimates. Some studies that calculated effect estimates for at least one EMT marker did not calculate estimates for every EMT marker measured.
First author | EMT markers examined | Type of regression | Adjustment covariates |
---|---|---|---|
Bates et al. (45) | Integrin α-v-β-6 | Cox | Age, sex, tumor type (colon vs. rectal), tumor stage |
Bellovin et al. (34) | E-cadherin | Cox | Age, sex, tumor stage, lymph node status |
Bellovin et al. (35) | E-cadherin | Cox | Age, sex, tumor stage, lymph node status |
Fan et al. (20) | E-cadherin, N-cadherin, Twist, Snail, β-catenin | Logistic | Support vector machine, tumor stage, CEA, CA19-9, CA125 |
Fan et al. (36) | E-cadherin, Twist, Snail | Logistic | Age, sex, histologic grade, tumor class |
Fujikawa et al. (37) | E-cadherin | Cox | Age, pathologic T category, lymph node metastasis |
Gomez et al. (48) | Twist, Slug (RNA only for both) | Cox | Age, sex, lymph node metastasis status, tumor stage, treatment protocol (e.g., chemo vs. surgery) |
Harbaum et al. (30) | Cytokeratin-7, E-cadherin | Cox | Age, sex, T classification, N classification, tumor grade |
Kahlert et al. (50) | ZEB2 | Cox | Age, sex, tumor stage, grade, type of resection (curative vs. noncurative), microsatellite stability, KRAS mutation status |
Kevans et al. (23) | E-cadherin, β-catenin | Cox | Age, sex, T-stage, tumor site, tumor differentiation, tumor budding, lymphovascular invasion, neural invasion, microsatellite status |
Khanh et al. (26) | Cytokeratin, TGFβ | Cox | Tumor grade, tumor depth, node status |
Knosel et al. (29) | Cytokeratins | Cox | Not listed |
Knosel et al. (38) | E-cadherin | Cox | Pathologic stage, venous invasion, PITX1 expression |
Koelzer et al. (31) | E-cadherin | Cox | T-stage, M-stage, N-stage, postoperative therapy |
Kroepil et al. (39) | E-cadherin, Snail | Cox | Age, sex, T-stage, N-stage, tumor grade |
Meng et al. (47) | miRNA | Cox | Tumor size, CEA, histologic grade |
Mesker et al. (24) | TGFβ, β-catenin | Cox | No adjustment covariates reported |
Saito et al. (41) | E-cadherin, vimentin | Cox | Age, sex, location (colon vs. rectal), tumor size, histological type, lymphatic invasion, venous invasion, invasion of primary tumor, lymph node metastasis |
Shioiri et al. (42) | E-cadherin, Slug | Cox | Age, distant metastasis, lymph node metastasis, lymphatic invasion, vessel invasion |
Spaderna et al. (27) | E-cadherin, β-catenin, ZEB1, vimentin | Cox | Not listed |
Yu et al. (44) | E-cadherin, Twist2 | Cox | Age, sex, T-stage, N-stage, M-stage, tumor differentiation, vascular invasion, tumor location, CEA level |
Zlobec (28) | Cytokeratin | Cox | Tumor budding, TNM stage, tumor grade, KRAS status, BRAF status, MGMT status, microsatellite instability status, CpG island methylator status |
Abbreviations: TNM, tumor–node–metastasis; CEA, carcinoembryonic antigen.
Overall, the literature has paid insufficient attention to marker measurement reliability. Of 27 studies that measured protein using IHC, only 12 (44%) reported any consideration of reliability (19, 20, 26, 30, 35, 36, 39, 41, 42, 44, 49, 50). All 12 were instances of inter-rater reliability in which at least two researchers scored the slides and then compared the results. One study of E-cadherin and RhoC reported that about 15% of slides initially received different scores (35). Another study that measured E-cadherin, Snail, and Twist reported initial discrepancies on about 8% of slides (36). Two studies noted that scoring disagreements were resolved via simultaneous reexamination using a multiheaded microscope (30, 50). Collectively, little other information was provided about the extent of agreement or how discrepancies were resolved. No study reported any assessment of intra-rater reliability.
Perhaps the best study to date in the literature on EMT marker expression and colorectal cancer patient outcomes is the one by Bates and colleagues (45). In addition to having one of the largest sample sizes in the literature (N = 488), it presents survival curves stratified by marker expression status as well as HRs from Cox modeling to estimate the association between marker expression and patient time-to-death. Several other papers provide similarly thorough analyses, but are weaker due to smaller sample sizes (42, 48, 50).
Criteria for a clinically useful EMT marker
We propose seven criteria for judging whether a particular EMT marker might be useful as a clinical tool to determine whether a primary tumor is potentially metastatic:
Biologic role: The role of the marker in the EMT mechanism should be well understood and critical.
Percentage of subjects with positive expression: A marker that is positive for nearly 0% or 100% of patients is unlikely to provide much information about the prognosis for different patients given how common metastasis is (54). Markers for which there are appreciable numbers of both marker-positive and marker-negative tumors are likeliest to be clinically useful.
Reliability: A useful EMT marker for clinical purposes will exhibit a high degree of reliability when measured in clinical colorectal cancer specimens in the same way. The reliability of EMT markers is difficult to assess at present because of inconsistencies in study design throughout the literature. Such inconsistencies make comparisons between studies difficult, even between studies that measured the same marker using the same laboratory technique.
Validity: To determine the sensitivity or specificity of EMT markers, one would need to compare marker expression levels with a “gold standard” measure of whether a tumor is metastatic. It is not clear what could constitute such a gold standard in this context. The closest candidate might be clinical diagnosis of metastases using imaging techniques. However, this is not wholly satisfactory because the goal is to identify metastases before they reach a size that is detectable by imaging. Correlation between EMT marker expression levels and imaging detection of metastases is a helpful, but not definitive demonstration of the validity of an EMT marker for clinical purposes. The reason is that correlating primary tumor EMT marker expression levels with clinical detection of metastases only involves cases where metastases are detected. However, a clinically useful EMT marker will provide information about the likelihood that the primary tumor is metastatic even in cases where no metastases are detected.
Association with patient outcomes: If the expression levels of a marker play a role in generating metastases and the marker is to serve as a clinical indicator of whether the primary tumor is metastatic, then the expression levels should be associated with patient time-to-mortality, that is, the length of the time interval between primary tumor surgery and patient death. This can be assessed using Cox proportional hazards modeling and Kaplan–Meier curves stratified by marker expression levels.
Amount of prior data: Clinical utility ought to be supported by as many studies as possible, each of which includes as many subjects as possible. All else being equal, we have more confidence in a marker whose evidence is based on a greater number of subjects.
Ability to measure the marker: If a marker is difficult to measure accurately in a clinical setting, its utility is limited no matter how strong the evidence for it may be according to the other criteria described above.
On the basis of the above criteria and the literature examined, E-cadherin seems to be the most promising EMT marker for clinical purposes. Its biologic role in EMT is central and well understood (2). As a direct mediator of cell–cell adhesion, decreases in its membranous expression play a critical part in cancer cells acquiring the ability to break away from the tumor. Though definitions of positive expression varied, most studies that measured E-cadherin found substantial numbers of both marker-positive and marker-negative subjects. More studies stratified survival by E-cadherin expression status than for any other marker. Most of these papers reported poorer survival for E-cadherin–negative subjects compared with E-cadherin–positive subjects, as would be expected for this epithelial marker.
This is not to say that the evidence conclusively demonstrates that E-cadherin levels are an important measure of whether the primary tumor could be metastatic. Indeed, the inconsistent study designs and analyses discussed earlier mean that the data are merely suggestive. However, the evidence does suggest that, were future studies to adopt consistent, reproducible definitions and procedures, E-cadherin would be the best candidate to emerge as a useful marker of EMT for colorectal cancer care.
Recommendations
Our review of the literature on EMT marker expression in primary colorectal cancer tumors and patient outcomes suggests several considerations for future studies, most of which are likely to apply to EMT marker/patient outcome studies in other tumor sites (Table 4).
First, for any given EMT marker, expression measurement needs to become more standardized and reproducible across investigator groups. For studies measuring protein expression of markers using IHC, computer-assisted image analysis should be used because it produces results that are highly correlated with those obtained by manual scoring and can provide a continuous measure of staining intensity (55). Obtaining continuous marker expression data will allow more flexibility in determining clinically relevant cut points to define tumor marker status compared with the ordinal data produced by manual scoring (e.g., 0, 1+, 2+, 3+). The scoring algorithm should be set up to restrict analysis to the one cellular compartment most relevant to the marker's biologic function in EMT. For example, membranous expression of transmembrane proteins such as cadherins or integrins should be scored, but not cytoplasmic or nuclear expression of these markers. Similarly, nuclear expression of transcription factors such as Snail, Slug, and Twist should be scored, but not their expression in other compartments. Restricting expression measurement to one biologically relevant compartment will simplify scoring algorithms and may lead to more clinically applicable results. When annotating tumor slide images between staining and computerized image scoring, investigators should be careful to mark the images so that only cancer cells are scored and not other tumor components such as stroma, endothelial cells, or fibroblasts.
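To illustrate why continuous expression data give more flexibility than ordinal scores, the following minimal sketch shows how the share of marker-positive tumors shifts as the cut point moves. The function name, the 0 to 1 intensity scale, and the scores themselves are hypothetical and chosen only for illustration; they are not drawn from any study reviewed here.

```python
def marker_positive_fraction(intensities, cut_point):
    """Fraction of tumors whose continuous staining score is >= cut_point."""
    if not intensities:
        raise ValueError("need at least one intensity value")
    positives = sum(1 for value in intensities if value >= cut_point)
    return positives / len(intensities)


# Hypothetical membranous staining scores on an assumed 0-1 scale.
scores = [0.05, 0.12, 0.30, 0.41, 0.55, 0.62, 0.78, 0.90]

# The same continuous data can be dichotomized at any candidate cut point.
for cut in (0.25, 0.50, 0.75):
    print(f"cut point {cut:.2f}: "
          f"{marker_positive_fraction(scores, cut):.0%} marker-positive")
```

With ordinal manual scores, only the few boundaries between categories are available as cut points, whereas a continuous measure allows any threshold to be evaluated against patient outcomes.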
Second, larger sample sizes are needed. Adoption of standardized definitions of tumor marker status as recommended above would increase the ability to perform valid meta-analyses of data on the same marker measured in the same way from different studies, thereby improving precision.
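The gain in precision from combining comparable studies can be sketched with fixed-effect inverse-variance pooling of log hazard ratios. This is one common pooling approach, shown here under the assumption that the studies measured the same marker in the same way; the function name and the input values are hypothetical and do not come from the reviewed literature.

```python
import math


def pool_log_hazard_ratios(estimates):
    """Fixed-effect inverse-variance pooling of hazard ratios.

    estimates: list of (hazard_ratio, standard_error_of_log_hr) tuples.
    Returns (pooled_hazard_ratio, pooled_standard_error_of_log_hr).
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    # Each study is weighted by the inverse of its variance on the log scale.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_log_hr = sum(
        w * math.log(hr) for w, (hr, _) in zip(weights, estimates)
    ) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return math.exp(pooled_log_hr), pooled_se


# Hypothetical hazard ratios and standard errors from three small studies.
studies = [(1.8, 0.40), (2.2, 0.35), (1.5, 0.50)]
pooled_hr, pooled_se = pool_log_hazard_ratios(studies)
lower = math.exp(math.log(pooled_hr) - 1.96 * pooled_se)
upper = math.exp(math.log(pooled_hr) + 1.96 * pooled_se)
print(f"pooled HR {pooled_hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The pooled standard error is always smaller than that of any single contributing study, which is the sense in which valid meta-analysis improves precision. The fixed-effect model is only defensible when marker definitions and modeling choices are consistent across studies.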
Third, statistical analyses should be thorough and consistent across studies. Every study of EMT markers in colorectal cancer tumors and patient outcomes should provide Kaplan–Meier curves stratified by expression levels of every marker studied, for each marker individually and in combination with the other markers studied. For example, if a study measured E-cadherin and vimentin, it should present three separate Kaplan–Meier plots: one stratified by E-cadherin expression, one by vimentin expression, and one by E-cadherin and vimentin expression jointly. Furthermore, every study should perform Cox proportional hazards modeling with EMT marker expression as an independent variable and the time from tumor surgery to patient mortality as the dependent variable. Studies should present univariate Cox analyses for each EMT marker studied as well as multivariate analyses. Adjustment covariates used in the multivariate models should be consistent across studies. The coding of all variables used in Cox modeling should also be consistent across studies of the same marker using the same laboratory technique. Unfortunately, past studies have not consistently performed both Cox regression and stratified Kaplan–Meier analyses for every EMT marker that they studied (Table 1).
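As a minimal sketch of the stratified Kaplan–Meier analysis recommended here, the product-limit estimator can be computed per marker stratum as follows. The function names and the patient data are hypothetical; in practice an established survival analysis package would normally be used, and this sketch omits confidence intervals and log-rank testing.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each patient (e.g., months since surgery).
    events: 1 if the patient died at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def stratified_curves(times, events, marker_positive):
    """Kaplan-Meier curves computed separately by marker status."""
    curves = {}
    for status in (True, False):
        group = [(t, e) for t, e, m in zip(times, events, marker_positive)
                 if m == status]
        curves[status] = kaplan_meier([t for t, _ in group],
                                      [e for _, e in group])
    return curves


# Hypothetical follow-up (months), death indicators, and marker status.
months = [6, 10, 14, 20, 24, 30, 36, 40]
died = [1, 1, 0, 1, 0, 1, 0, 0]
e_cadherin_positive = [False, False, True, False, True, True, True, False]

for status, curve in stratified_curves(months, died,
                                       e_cadherin_positive).items():
    label = "marker-positive" if status else "marker-negative"
    print(label, [(t, round(s, 3)) for t, s in curve])
```

For a study of two markers, the same function could be applied three times: once per marker and once with a joint status variable, matching the three plots described above.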
One consideration worthy of future development is identifying the set of covariates that ought to be included in multivariate Cox analyses to provide valid estimates of associations between EMT marker expression levels and patient time-to-mortality. The covariates included in models varied widely across the papers, typically with little justification provided for the choices (Table 3). In addition to identifying which covariates ought to be adjusted for in Cox models, consideration should be given to the optimal coding of covariates.
Fourth, it is not clear whether protein or RNA measurements of a given marker's expression would be more useful for clinical purposes. Whenever possible, we suggest that investigators measure both protein and RNA expression of a marker in the same tumors, then report Cox modeling results for each so that point estimates and precision measures can be compared.
Fifth, we recommend that investigators perform evaluations of inter-rater and intra-rater reliability, especially for immunohistochemical studies of marker protein expression. Using computer-assisted image analysis would not eliminate the need to address this point. Any algorithms developed as part of marker measurement should have their reliability externally assessed and stated in reports. For instance, if an algorithm is developed to distinguish between epithelial and nonepithelial cells in images of tissue for the purpose of automating annotations, the algorithm's reliability could be assessed by manually annotating a portion of images and comparing results.
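One widely used summary of agreement between two sets of categorical scores is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is illustrative only: the reviewed studies did not necessarily use this statistic, and the function name and slide scores are hypothetical. The same calculation could be applied to two raters (inter-rater), to one rater scoring the same slides on two occasions (intra-rater), or to an algorithm compared against manual annotation.

```python
from collections import Counter


def cohens_kappa(scores_a, scores_b):
    """Cohen's kappa for two equal-length lists of categorical scores."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(scores_a)
    # Observed proportion of slides given the same score by both raters.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(scores_a)
    counts_b = Counter(scores_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical manual IHC scores from two pathologists on ten slides.
rater_1 = ["0", "1+", "2+", "3+", "2+", "1+", "0", "3+", "2+", "1+"]
rater_2 = ["0", "1+", "2+", "2+", "2+", "1+", "1+", "3+", "2+", "1+"]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")
```

Reporting a chance-corrected statistic alongside the raw percentage of discrepant slides would make reliability results easier to compare across studies.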
Finally, for clinical purposes it is important to know to what extent a given EMT marker's expression tends to be heterogeneous throughout a tumor mass. It is likely that the tumor's invasive front is the crucial portion that needs to be sampled to determine whether the tumor might be producing metastases, but the evidence addressing this point needs to be strengthened. Thus, we recommend that investigators routinely measure EMT markers in multiple cores from every tumor studied, including the invasive front, tumor center, and surface of the tumor away from the invasive front. Expression and modeling data from these different tumor portions should be presented in reports to evaluate marker expression heterogeneity within a tumor.
Conclusion
Together, the papers examined here suggest that EMT markers hold promise as clinical tools and merit further study. At present, the literature is hampered by small sample sizes and inconsistent methods. By increasing the standardization, reproducibility, and thoroughness of study designs and data analysis, future studies may strengthen the quality of the evidence produced and make possible valid meta-analyses that will improve precision. This work could clarify which EMT marker, and which expression levels of that marker, can best assist clinicians in making optimal treatment decisions. Strengthening the literature per the recommendations outlined here may facilitate translating EMT markers to clinical use.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.