Purpose: The early detection of colorectal cancer (CRC) is crucial for successful treatment and patient survival. However, compliance with current screening methods remains poor. This study aimed to identify an accurate blood-based gene expression signature for CRC detection.

Experimental Design: Gene expression in peripheral blood samples from 216 patients with CRC tumors and 187 controls was investigated in the study. We first conducted a microarray analysis to select candidate genes that were significantly differentially expressed between patients with cancer and controls. A quantitative reverse transcription PCR assay was then used to evaluate the expression of selected genes. A gene expression signature was identified using a training set (n = 200) and then validated using an independent test set (n = 160).

Results: We identified an 18-gene signature that discriminated the patients with CRC from controls with 92% accuracy, 91% sensitivity, and 92% specificity. The signature performance was further validated in the independent test set with 86% accuracy, 84% sensitivity, and 88% specificity. The area under the receiver operating characteristics curve was 0.94. The signature was shown to be enriched in genes related to immune functions.

Conclusions: This study identified an 18-gene signature that accurately discriminated patients with CRC from controls in peripheral blood samples. Our results prompt the further development of blood-based gene expression biomarkers for the diagnosis and early detection of CRC. Clin Cancer Res; 19(11); 3039–49. ©2013 AACR.

Translational Relevance

Colorectal cancer (CRC) is a leading cause of cancer-related deaths worldwide. The early detection of CRC is crucial for successful treatment and patient survival. However, compliance with current screening methods remains poor and there is a clear need for an accurate in vitro blood test to increase participation in CRC screening. This study shows that the gene expression profiles of peripheral blood samples can be used to distinguish between patients with CRC and controls. An 18-gene signature was identified and validated as highly sensitive and specific for detecting CRC in blood samples. These results open an avenue for the further development of blood-based gene expression biomarkers for the diagnosis and early detection of CRC.

Colorectal cancer (CRC) is the third most common malignancy and the fourth most common cause of cancer-related mortality worldwide (1). In 2008, more than 1 million cases were newly diagnosed and more than 600,000 people died from the disease (2). Given its slow development from removable precancerous lesions and curable early stages, screening for CRC has the potential to reduce both the incidence and mortality of the disease (3). The available screening tools include fecal occult blood testing (FOBT), stool DNA tests, flexible sigmoidoscopy, computed tomographic (CT) colonography, and colonoscopy. Different screening strategies are preferred in various countries. However, compliance with current CRC screening recommendations remains poor. The most reliable screening tool, colonoscopy, is invasive, costly, and conducted infrequently. In contrast, the currently most widely used noninvasive screening option, FOBT, has important limitations, including inconvenience and low sensitivity.

The discovery of novel biomarkers based on the analysis of blood samples has become a focus of current research. A novel blood biomarker may offer several practical advantages compared with the currently used screening approaches. First, in vitro blood tests are safe and minimally invasive. Second, no dietary restriction, colon cleansing, or sedation is required. Third, the sample collection and processing procedures may be easier and more convenient. Furthermore, there is no microflora that could degrade the biomarker or hamper the analysis. Thus, a sensitive and specific in vitro blood test that detects CRC in the currently noncompliant patient population will improve participation in screening, increase the accuracy of detection, and therefore save lives.

Several studies have shown that molecular biomarkers in the peripheral blood can be used to develop in vitro tests for the detection of CRC. The DNA methylation biomarker methylated Septin 9 (mSEPT9) previously showed a sensitivity of 70% and a specificity of 90% for discriminating patients with CRC from controls (4, 5). Recently, the performance of mSEPT9 was determined in a prospective study of 7,940 average-risk individuals undergoing colonoscopy for CRC screening. The sensitivity in the 45 patients with CRC was 67%, with a specificity of 88% (6). DNA microarray technology can quantify the expression of several thousand genes simultaneously and may be able to capture the complex biology that underlies colorectal tumorigenesis and progression better than single-gene markers. Several studies have used DNA microarray technology to identify blood-based gene expression signatures for CRC detection. Han and colleagues reported a 5-gene signature with 88% sensitivity and 64% specificity (7). Marshall and colleagues recently published a 7-gene signature with a sensitivity of 72% and a specificity of 70% (8). Rosenthal and colleagues also reported a panel of 202 genes with 90% sensitivity and 88% specificity (9).

In this study, we tested the hypothesis that gene expression profiling of peripheral blood cells could yield diagnostic information in a cohort of Chinese patients. An 18-gene signature was identified and validated as highly sensitive and specific for detecting CRC in blood samples.

Study design and patients

Study participants were recruited from the Fudan University Shanghai Cancer Center and Shanghai Qibao Community Hospital from 2006 through 2010 (Shanghai, China). All the participants were Chinese. For the CRC group, all patients had blood collected before surgery. None of the patients with cancer had received preoperative radiotherapy or chemotherapy before blood collection. The tumors were staged according to the tumor–node–metastasis (TNM) system. Patients suffering from hereditary CRC were excluded. For the control group, FOBT-positive participants without any symptoms of inflammatory bowel diseases, polyps, or CRC, as confirmed by colonoscopy, were enrolled from a population-based screening program. Figure 1 depicts the 3 phases of the study design and Table 1 summarizes the clinical characteristics of the samples in the study.

Figure 1.

Study design. The blood gene expression profiles of 216 patients with CRC and 187 controls were investigated in 3 different phases. We first conducted a microarray analysis to select candidate genes that were significantly differentially expressed between patients with cancer and controls. qRT-PCR assays were then applied to evaluate the expression of selected genes. A gene expression signature was identified using a training set (n = 200) and then validated using an independent test set (n = 160). *, TNM stage was not available for 1 patient in the test set. FUSCC, Fudan University Shanghai Cancer Center.

Table 1.

Characteristics of the CRC and control populations

| Variable | Discovery set, CRC (n = 100) | Discovery set, Control (n = 100) | Training set, CRC (n = 100) | Training set, Control (n = 100) | Test set, CRC (n = 87) | Test set, Control (n = 73) |
|---|---|---|---|---|---|---|
| Age, y: mean | 57.6 | 56.5 | 56.3 | 56.3 | 56.2 | 56.3 |
| Age, y: range | 27–78 | 38–74 | 27–78 | 38–74 | 26–78 | 32–74 |
| Sex, no. (%): male | 50 (50.0) | 50 (50.0) | 49 (49.0) | 48 (48.0) | 50 (57.5) | 17 (23.3) |
| Sex, no. (%): female | 50 (50.0) | 50 (50.0) | 51 (51.0) | 52 (52.0) | 37 (42.5) | 56 (76.7) |
| Tumor site, no. (%): colon | 41 (41.0) | NA | 51 (51.0) | NA | 27 (31.0) | NA |
| Tumor site, no. (%): rectum | 59 (59.0) | NA | 49 (49.0) | NA | 60 (69.0) | NA |
| Tumor stage (a), no. (%): stage I | 16 (16.0) | NA | 11 (11.0) | NA | 11 (12.6) | NA |
| Tumor stage (a), no. (%): stage II | 36 (36.0) | NA | 40 (40.0) | NA | 31 (35.6) | NA |
| Tumor stage (a), no. (%): stage III | 24 (24.0) | NA | 22 (22.0) | NA | 31 (35.6) | NA |
| Tumor stage (a), no. (%): stage IV | 24 (24.0) | NA | 27 (27.0) | NA | 13 (14.9) | NA |

NA, not applicable.

(a) TNM stage was not available for 1 patient in the test set.

In the discovery set, we analyzed the whole-blood gene expression profiles of 100 patients with CRC and 100 controls using GeneChip U133plus2 microarrays (Affymetrix). There were no significant differences in the distribution of age or gender between the CRC and control groups. The CRC group included 41 patients with colon cancer and 59 with rectal cancer. Sixteen of the patients with cancer were stage I, 36 were stage II, 24 were stage III, and 24 were stage IV. The significance analysis of microarray (SAM) method was used to identify genes that were differentially expressed between the CRC and control groups (10). The list of genes was further refined by expression signal intensity, fold change, biologic annotation, and probe set grade. Finally, 52 unique genes were selected for further testing by quantitative reverse transcription PCR (qRT-PCR).

The training set consisted of 100 patients with CRC and 100 controls, including 71 patients with CRC and 86 controls that were also used in the discovery set to assess the correspondence between the microarray and qRT-PCR measurements. The remaining 43 samples from the discovery set were not suitable for qRT-PCR experiments because of low RNA concentrations and were therefore replaced with new samples. The distributions of age and gender were balanced between the CRC and control groups. The CRC group included 51 patients with colon cancer and 49 with rectal cancer. Eleven of the patients with cancer were stage I, 40 were stage II, 22 were stage III, and 27 were stage IV. These samples were used as the training set to identify the gene expression signature for differentiation between the CRC group and the control group.

In the test set, we used an independent cohort of 87 patients with CRC and 73 controls. The CRC group included 27 patients with colon cancer and 60 with rectal cancer. Eleven of the patients with cancer were stage I, 31 were stage II, 31 were stage III, and 13 were stage IV. The TNM variables of one patient were not available. The fully specified gene expression signature from the training set was applied to the test set for validating the signature performance in the independent samples.

The study was approved by the Institutional Review Board of Fudan University Shanghai Cancer Center and written informed consent was obtained from all participants.

Blood collection and RNA extraction

For each participant, 2.5 mL of peripheral blood was collected into PAXgene Blood RNA tubes, and the total RNA was extracted with the PAXgene Blood RNA System (PreAnalytiX). The quantity of total RNA was measured spectrophotometrically from the optical density at 260 nm, and the quality was assessed using the RNA 6000 Nano LabChip Kit on an Agilent 2100 Bioanalyzer (Agilent Technologies). All samples met the quality criterion: RNA integrity number > 7.0.

Microarray hybridization

For each sample, 50 ng of total RNA was reverse transcribed and linearly amplified as single-stranded cDNA using Ribo-SPIA technology with the WT-Ovation RNA Amplification System (NuGEN Technologies), and the products were purified using the QIAquick PCR Purification Kit (Qiagen). A total of 2 μg of amplified and purified cDNA was subsequently fragmented with RQ1 RNase-Free DNase (Promega) and labeled with biotinylated deoxynucleoside triphosphates using Terminal Transferase (Roche Diagnostics) and GeneChip DNA Labeling Reagent. The labeled cDNA was hybridized onto the GeneChip U133plus2 microarray in a Hybridization Oven 640 (Agilent Technologies) at 60 rpm at 50°C for 18 hours. After hybridization, the arrays were washed and stained according to the Affymetrix protocol EukGE-WS2v4 using a GeneChip Fluidics Station 450. The arrays were scanned with a GeneChip Scanner 3000.

qRT-PCR

For the training set and the test set, qRT-PCR using SYBR Green assays was conducted according to the manufacturer's instructions. For each sample, a total of 320 ng of RNA was reverse transcribed into single-stranded cDNA using the QuantiTect Reverse Transcription Kit (Qiagen). The cDNA was amplified using the SYBR Premix DimerEraser (Perfect Real Time) Kit (Takara Biotechnology). The amplification was detected in real time using the Applied Biosystems 7900HT Fast Real-Time PCR System (Life Technologies). The primers were designed primarily within the target sequences of the selected Affymetrix probe sets with Beacon Designer software (Premier Biosoft). The primer sequences are provided in Supplementary Table S1. The primer pairs were experimentally validated with the following criteria: (i) a single gene-specific product was produced; (ii) the amplification efficiency ranged between 90% and 110%; and (iii) the cycle threshold (Ct) value of the no-template control was more than 35.

Six candidate reference genes (ACTB, CRY2, CSNK1G2, DECR1, FARP1, and TRAP1) that have been reported to be consistently expressed in human whole-blood samples were selected for investigation (11). The expression stability of the 6 reference genes in the training set was estimated using 4 commonly used algorithms: geNorm (12), NormFinder (13), BestKeeper (14), and the comparative cycle threshold (ΔCt) method (15). Each algorithm ranked the reference genes from most stable (rank #1) to least stable (rank #6). The overall ranking of candidate reference genes was calculated according to the RefFinder method described by Chen and colleagues (16). Briefly, the geometric mean of the 4 rank numbers of each gene was calculated, and the candidate reference genes were then ranked by this geometric mean, with the smallest geometric mean indicating the most stable reference gene. As a result, CSNK1G2, DECR1, and FARP1 were shown to be the most stable genes in the candidate list. Therefore, these 3 genes were selected as reference genes and their geometric mean was used as a normalization factor for qRT-PCR data normalization.
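The rank aggregation step can be illustrated with a minimal Python sketch. The function names and the rank values below are hypothetical and chosen only for illustration; they are not the study's data, and the sketch covers only the geometric-mean ranking, not the four stability algorithms themselves.

```python
from math import prod


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def overall_ranking(ranks_by_algorithm):
    """Order genes by the geometric mean of their per-algorithm ranks.

    A smaller geometric mean is read as a more stable reference gene.
    """
    genes = list(next(iter(ranks_by_algorithm.values())))
    scores = {
        gene: geometric_mean(ranks[gene] for ranks in ranks_by_algorithm.values())
        for gene in genes
    }
    order = sorted(genes, key=lambda gene: scores[gene])
    return order, scores


# Hypothetical ranks (1 = most stable, 6 = least stable); NOT the study's values.
example_ranks = {
    "geNorm":     {"ACTB": 6, "CRY2": 4, "CSNK1G2": 1, "DECR1": 2, "FARP1": 3, "TRAP1": 5},
    "NormFinder": {"ACTB": 6, "CRY2": 5, "CSNK1G2": 2, "DECR1": 1, "FARP1": 3, "TRAP1": 4},
    "BestKeeper": {"ACTB": 5, "CRY2": 4, "CSNK1G2": 2, "DECR1": 3, "FARP1": 1, "TRAP1": 6},
    "deltaCt":    {"ACTB": 6, "CRY2": 4, "CSNK1G2": 1, "DECR1": 2, "FARP1": 3, "TRAP1": 5},
}

order, scores = overall_ranking(example_ranks)
print(order[:3])  # the three genes with the smallest geometric mean rank
```

With these illustrative ranks, the three top-ranked genes would be the ones carried forward as reference genes.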

Statistical analysis

Microarray data were analyzed using R software and packages from the Bioconductor project (17, 18). Raw data have been deposited in the ArrayExpress public repository and are accessible through the accession number E-MTAB-1532. Raw data were normalized using the robust multichip average (RMA) method (19). The probe set level data were log2-transformed. In addition, we applied a bioinformatics-based filtering approach using information in the Entrez Gene Database (20). Probe sets without Entrez Gene ID annotation were removed. For multiple probe sets mapping to the same Entrez Gene ID, only the probe set showing the largest interquartile range was kept, and the rest were excluded. The analysis of differentially expressed genes was conducted using the SAM method implemented in the “samr” package (10).
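The probe set filtering rule can be sketched as follows. This is a hedged illustration in Python rather than the R code used in the study; the function names, probe set identifiers, gene identifiers, and expression values are all hypothetical, and the quartile convention (`method="inclusive"`) is an assumption.

```python
from statistics import quantiles


def interquartile_range(values):
    """Difference between the third and first quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def collapse_probe_sets(expression, probe_to_gene):
    """Return {gene_id: probe_set}, keeping the probe set with the largest IQR.

    Probe sets without a gene annotation are dropped.
    """
    best = {}
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # no Entrez Gene ID annotation
        spread = interquartile_range(values)
        if gene not in best or spread > best[gene][1]:
            best[gene] = (probe, spread)
    return {gene: probe for gene, (probe, _) in best.items()}


# Hypothetical log2 expression values for four probe sets across four samples.
expression = {
    "probe_a": [5.0, 5.1, 5.2, 5.3],
    "probe_b": [4.0, 5.0, 6.0, 7.0],
    "probe_c": [8.0, 8.5, 9.0, 9.5],
    "probe_d": [3.0, 3.5, 4.0, 4.5],  # unannotated, so it is removed
}
probe_to_gene = {"probe_a": "1001", "probe_b": "1001", "probe_c": "2002"}

kept = collapse_probe_sets(expression, probe_to_gene)
print(kept)
```

Here `probe_a` and `probe_b` map to the same hypothetical gene, and only the more variable one is retained.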

The qRT-PCR–based gene expression levels were estimated using the comparative ΔCt method of relative quantification (21), normalizing the Ct values relative to the normalization factor. The relative fold change was represented as $2^{-\Delta\Delta C_{\mathrm{t}}}$, where ΔΔCt = mean ΔCt (CRC) − mean ΔCt (control). We used the minimum redundancy maximum relevance (mRMR) algorithm for gene selection (22), the support vector machine (SVM) algorithm for classification (23), and the leave-one-out cross-validation (LOOCV) procedure for sampling. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were conducted using the GeneCodis bioinformatics tool (24–26).
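As a worked illustration of the relative quantification, the following Python sketch computes ΔCt and the fold change 2^(−ΔΔCt). The function names and Ct values are hypothetical, and the normalization factor is taken as a given input because the sketch does not reproduce how it was derived from the reference genes.

```python
from statistics import mean


def delta_ct(target_ct, normalization_factor):
    """Ct of the target gene relative to the sample's normalization factor."""
    return target_ct - normalization_factor


def fold_change(delta_ct_crc, delta_ct_control):
    """Relative fold change 2^(-ddCt), with ddCt = mean dCt(CRC) - mean dCt(control)."""
    ddct = mean(delta_ct_crc) - mean(delta_ct_control)
    return 2.0 ** (-ddct)


# Hypothetical values: (target Ct, normalization factor) per sample.
crc_samples = [(24.0, 20.0), (25.0, 20.0), (26.0, 20.0)]
control_samples = [(26.0, 20.0), (26.0, 20.0), (26.0, 20.0)]

dct_crc = [delta_ct(ct, nf) for ct, nf in crc_samples]
dct_control = [delta_ct(ct, nf) for ct, nf in control_samples]

fc = fold_change(dct_crc, dct_control)
print(fc)  # a value above 1 means higher expression in CRC
```

In this toy case the mean ΔCt is one cycle lower in the CRC group, which corresponds to a twofold higher expression.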

Microarray analysis and candidate gene selection

The recent release of the U133plus2 array contains more than 54,000 probe sets, which represent approximately 38,500 human genes. Confronted with such an overwhelming amount of information, it was necessary to reduce the total number of genes analyzed to a manageable number of genes with well-characterized biologic information and use visualization schemes to facilitate the recognition of patterns in the data (27). We thus conducted a bioinformatics-based filtering procedure to summarize the probe sets at the gene level and exclude those probe sets with low-grade biologic annotations. After filtering, the expression profiles of 8,662 unique genes in 100 patients with CRC and 100 controls were retained for downstream analysis.

Using the SAM method, we identified 263 genes that were differentially expressed between the CRC group and the control group, among which 179 genes were upregulated and 84 genes were downregulated in the patients with cancer. Using the 263 differentially expressed genes, a hierarchical clustering analysis showed that 78 of 100 controls and 76 of 100 patients with CRC were correctly classified (Supplementary Fig. S1). The gene list was further refined by multiple criteria: (i) an average expression intensity of greater than 64; (ii) a fold change in mean expression intensity of greater than 1.2; (iii) genes with known biologic function; and (iv) a high-grade probe design. This analysis resulted in 52 candidate biomarkers for further qRT-PCR study.

Identification of an 18-gene signature in the training set

The training set included 100 patients with CRC and 100 controls, among which 71 patients with CRC and 86 controls were also included in the discovery set. First, we compared the expression profiles of the 52 selected genes measured by both microarray and qRT-PCR. For each gene, the fold change between the 71 patients with CRC and 86 controls was calculated. Spearman correlation analysis showed that the fold change of candidate genes measured by microarray and qRT-PCR was highly comparable [r = 0.94; 95% confidence interval (CI), 0.90–0.98; P < 0.001], although a few exceptions were observed. Three genes were considered outliers because their expression profiles were not consistent between the microarray and qRT-PCR analyses.

Signature identification was subsequently conducted using the SVM classification model, with the selection of significant genes based on the mRMR method, through repetitions of the LOOCV process. As shown in Fig. 2, our process could conceptually be broken into 6 steps:

Figure 2.

Gene signature identification process. Our process could be conceptually broken down into 6 steps: 1, the samples were divided into an inner training set and an inner test set. 2, using only the inner training set, the mRMR algorithm returned a subset of n genes that had maximum relevance with the clinical status and minimum redundancy within the gene set. 3, the n-gene–based SVM classification model was built using the linear kernel and fitted to the inner training set. 4, the developed model was used to predict the class of the test sample. The predicted class was compared with the true class label of the sample. If they disagreed, the prediction was in error. 5, the process was repeated leaving each of the 200 samples out of the training set, one at a time. A total of 200 different models were created, and each model was used to predict the class of the test sample. Eventually, the number of prediction errors was summed up and reported as the LOOCV estimate of the prediction error. 6, steps 1 to 5 were repeated 52 times starting with n = 1, increasing by one at a time until n = 52. The LOOCV performance for each of the n-gene signatures was estimated and compared to determine the optimal size of the signature.

  1. The samples were divided into an inner training set and an inner test set. The inner test set consisted of only a single sample; the remaining 199 samples were placed in the inner training set. The sample in the test set was placed aside and not used in the development of the class prediction model.

  2. Using only the inner training set, the mRMR algorithm was used to search for a subset of n genes that had maximum relevance with the clinical status and minimum redundancy within the gene set.

  3. The n-gene based SVM model was built using the linear kernel and was fitted to the inner training set.

  4. The developed model was used to predict the class of the test sample. The prediction was based on the expression profile of the test sample, without using knowledge of the true class of the sample. The predicted class was compared with the true class label of the sample. If they disagreed, the prediction was in error.

  5. Then, a new training set–test set partition was created. This time another sample was placed in the inner test set, and all of the other samples were placed in the inner training set. A new classification model was constructed using the samples in the new training set. Although the same algorithm for gene selection and parameter estimation was used, because the new model was constructed on the basis of the new training set, it would in general not select exactly the same gene set as the previous model. Again, the model was applied to the expression profile of the test sample. If the predicted class did not agree with the true class label of the test sample, then the prediction was in error. The process was repeated leaving each of the 200 biologically independent samples out of the training set, one at a time. During the steps, 200 different models were created and each model was used to predict the class of the test sample. Eventually, the number of prediction errors was totaled and reported as the LOOCV-based error rate.

  6. The number of genes (n) to be selected by the mRMR algorithm was set as a predefined variable. Steps 1 to 5 were repeated 52 times, starting with n = 1 and adding 1 gene at a time until n = 52. The LOOCV-based performance for each n-gene signature was estimated and used to determine the optimal size of the signature.
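The six steps above can be sketched in Python to show the key design point: gene selection is repeated inside every leave-one-out fold, so the held-out sample never influences which genes are chosen. This is a simplified illustration, not the study's implementation. To keep it dependency-free, a ranking by difference in class means stands in for mRMR and a nearest-centroid rule stands in for the linear-kernel SVM; all function names and the toy data are hypothetical.

```python
from statistics import mean


def rank_genes(samples, labels, n):
    """Stand-in for mRMR: top-n genes by absolute difference in class means."""
    def score(gene):
        crc = [s[gene] for s, y in zip(samples, labels) if y == 1]
        ctl = [s[gene] for s, y in zip(samples, labels) if y == 0]
        return abs(mean(crc) - mean(ctl))
    return sorted(range(len(samples[0])), key=score, reverse=True)[:n]


def fit_centroids(samples, labels, genes):
    """Stand-in for the linear SVM: one centroid per class over the selected genes."""
    return {
        y: [mean(s[g] for s, l in zip(samples, labels) if l == y) for g in genes]
        for y in set(labels)
    }


def predict(centroids, sample, genes):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def distance(centroid):
        return sum((sample[g] - centroid[i]) ** 2 for i, g in enumerate(genes))
    return min(centroids, key=lambda y: distance(centroids[y]))


def loocv_errors(samples, labels, n):
    """Steps 1 to 5: leave each sample out once and count prediction errors."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        genes = rank_genes(train_x, train_y, n)   # selection uses the inner training set only
        model = fit_centroids(train_x, train_y, genes)
        if predict(model, samples[i], genes) != labels[i]:
            errors += 1
    return errors


def scan_signature_sizes(samples, labels, max_n):
    """Step 6: repeat the LOOCV for n = 1 .. max_n."""
    return {n: loocv_errors(samples, labels, n) for n in range(1, max_n + 1)}


# Toy data: 6 samples x 3 genes; only gene 0 separates the classes.
toy_x = [[1.0, 5.0, 3.0], [1.2, 4.0, 3.1], [0.9, 6.0, 2.9],
         [3.0, 5.1, 3.0], [3.1, 4.2, 3.2], [2.9, 5.8, 2.8]]
toy_y = [0, 0, 0, 1, 1, 1]

errors_by_size = scan_signature_sizes(toy_x, toy_y, max_n=3)
print(errors_by_size)
```

Selecting genes on all samples before cross-validation would leak information from the held-out sample and make the error estimate optimistic, which is why the selection call sits inside the loop.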

At the end of the process, 10,400 different classification models were constructed to assess signature performance in association with the number of genes included in the signature. As shown in Fig. 3A, the classification accuracy was 53.5% when only 1 gene was included in the signature. The performance accuracy increased to 90.0% when 9 genes were included in the signature. However, the classification accuracy decreased to 85.5% when another gene was added, implying that the observed performance might not be truly stable at a signature size of approximately 10. The performance increased continuously and reached 90.5% and 91.5% accuracy at signature sizes of 15 and 18, respectively. Subsequently, the signature performance increased to 92.5% and 93.0% accuracy with signature sizes of 38 and 48, respectively.

Figure 3.

Identification of the 18-gene signature. A, the classification accuracy during the LOOCV process for each n-gene signature was estimated and was used to determine the optimal size of the signature. The size determination took into account the classification accuracy and signature complexity. Ultimately, we considered a signature size of 18 to be ideal. The results of the LOOCV process showed that an estimated performance of 91.5% accuracy was achievable with a signature composed of the top 18 genes. B, univariate fold changes of 18 genes across 100 patients with CRC and 100 controls in the training set. Positive fold changes indicate genes that are upregulated in CRC, whereas negative fold changes indicate genes that are downregulated in CRC.


As expected, the classification performance generally increased as more genes were added to the signature. However, signature size determination also needs to consider signature complexity; when more genes are included in the signature, the signature becomes more complex and less generalizable. Our strategy was to identify a signature that showed satisfactory performance but maintained a compact signature size and reasonable complexity for future development. Although the prediction accuracy could be improved from 91.5% to 93.0% by adding 30 additional genes (signature size from 18 to 48), this gain of 1.5 percentage points did not justify the added complexity.

Ultimately, we considered a signature size of 18 to be ideal. As shown in Fig. 3A, an estimated performance of 91.5% accuracy, 91% sensitivity, and 92% specificity was achievable with a signature composed of the top 18 genes. We then used a consensus method to assess the stability of candidate genes across the LOOCV process. Through the LOOCV process specifying n = 18, we identified 200 different top 18 gene lists. The appearance rate of each gene among all of the top 18 gene lists was recorded. The maximum appearance rate was 100% (i.e., some genes were included in all of the top 18 gene lists during the cross-validation process). In contrast, the minimum appearance rate was 0%, indicating that those genes were never included in the top 18 gene lists during the cross-validation process. The genes were ranked according to their appearance rates. The 18 genes with the highest appearance rates were selected for the final gene expression signature. The entire training set was used to specify the parameters of the final 18-gene signature.
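The consensus step can be illustrated with a short Python sketch that counts how often each gene appears across the per-fold gene lists and keeps the most frequent ones. The function names and the toy gene lists are hypothetical, and the alphabetical tie-break is an assumption added only to make the output deterministic.

```python
from collections import Counter


def appearance_rates(gene_lists):
    """Fraction of cross-validation folds in which each gene was selected."""
    counts = Counter(gene for genes in gene_lists for gene in set(genes))
    return {gene: count / len(gene_lists) for gene, count in counts.items()}


def consensus_signature(gene_lists, size):
    """Genes with the highest appearance rates (ties broken alphabetically)."""
    rates = appearance_rates(gene_lists)
    return sorted(rates, key=lambda gene: (-rates[gene], gene))[:size]


# Toy example: 4 folds, each selecting its own top-2 gene list.
fold_gene_lists = [["A", "B"], ["A", "C"], ["A", "B"], ["A", "B"]]

rates = appearance_rates(fold_gene_lists)
signature = consensus_signature(fold_gene_lists, size=2)
print(rates, signature)
```

Genes that are never selected simply do not appear in `rates`, which corresponds to an appearance rate of 0%.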

Table 2 shows the descriptions of the 18 selected genes and their statistical significance across control and CRC samples in the training set. Eight and 10 genes were up- and downregulated, respectively, in patients with CRC compared with controls (Fig. 3B); NEAT1 (nuclear paraspeckle assembly transcript 1) was the most significantly upregulated gene; and DUSP2 (dual-specificity phosphatase 2) was the most significantly downregulated gene. Gene Ontology and KEGG pathway analyses showed that several of the 18 genes are involved in apoptosis (e.g., GZMB and IL1B), cell adhesion (e.g., CD36 and ITGAM), the mitogen-activated protein kinase (MAPK) signaling pathway (e.g., DUSP2 and IL1B), signal transduction (e.g., SH2D2A, PDE4D, and IL1B), and some are hematopoietic cell markers (e.g., CD36, ITGAM, and IL1B; Supplementary Table S2).

Table 2.

Composition of the 18-gene signature

| Gene | Description | Cytoband | UniGene | P value |
|---|---|---|---|---|
| CD36 | CD36 molecule (thrombospondin receptor) | 7q11.2 | Hs.120949 | 8.76E-07 |
| DHRS13 | Dehydrogenase/reductase (SDR family) member 13 | 17q11.2 | Hs.631760 | 2.96E-04 |
| DUSP2 | Dual-specificity phosphatase 2 | 2q11 | Hs.1183 | 1.69E-10 |
| FAM198B | Family with sequence similarity 198, member B | 4q32.1 | Hs.567498 | 7.57E-08 |
| FKBP5 | FK506-binding protein 5 | 6p21.31 | Hs.407190 | 4.70E-05 |
| GLT25D2 | Glycosyltransferase 25 domain containing 2 | 1q25 | Hs.387995 | 2.39E-07 |
| GZMB | Granzyme B (granzyme 2, CTL-associated serine esterase 1) | 14q11.2 | Hs.1051 | 9.68E-09 |
| IL1B | Interleukin 1, β | 2q14 | Hs.126256 | 2.09E-04 |
| ITGAM | Integrin, α M (complement component 3 receptor 3 subunit) | 16p11.2 | Hs.172631 | 6.27E-09 |
| ITPRIPL2 | Inositol 1,4,5-trisphosphate receptor interacting protein-like 2 | 16p12.3 | Hs.530899 | 7.72E-05 |
| MYBL1 | v-myb Myeloblastosis viral oncogene homolog (avian)-like 1 | 8q22 | Hs.445898 | 2.42E-09 |
| NEAT1 | Nuclear paraspeckle assembly transcript 1 (nonprotein coding) | 11q13.1 | Hs.523789 | 3.48E-10 |
| NUDT16 | Nudix (nucleoside diphosphate linked moiety X)-type motif 16 | 3q22.1 | Hs.282050 | 1.08E-05 |
| P2RY10 | Purinergic receptor P2Y, G-protein coupled, 10 | Xq21.1 | Hs.296433 | 8.54E-09 |
| PDE4D | Phosphodiesterase 4D, cAMP-specific | 5q12 | Hs.117545 | 4.32E-08 |
| PDZK1IP1 | PDZK1-interacting protein 1 | 1p33 | Hs.431099 | 4.02E-05 |
| SH2D2A | SH2 domain containing 2A | 1q21 | Hs.103527 | 3.01E-08 |
| VSIG10 | V-set and immunoglobulin domain containing 10 | 12q24.23 | Hs.187624 | 1.60E-06 |

Validation of the 18-gene signature in an independent test set

The 18-gene signature was then applied to an independent test set of 87 patients with CRC and 73 controls. These samples had been excluded from the training set and were not used in the development of the 18-gene signature. For each sample, a probability score was calculated and a threshold of 50% was used to classify samples. Samples with a probability score less than 50% were classified as controls, whereas samples with a probability score more than 50% were classified as CRC (Fig. 4). Of the 73 controls, 64 were correctly classified and 9 were misclassified. Of the 87 patients with CRC, 73 were correctly classified and 14 were misclassified. In the test set, the 18-gene signature had 85.6% accuracy (95% CI, 79%–90%), 83.9% sensitivity (95% CI, 74%–90%), and 87.7% specificity (95% CI, 77%–94%). When the sensitivity was plotted against the specificity in a receiver operating characteristic curve, the area under the curve (AUC) was 0.94 (95% CI, 0.91–0.98).
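The point estimates above follow directly from the four classification counts reported for the test set. The short Python sketch below is illustrative only and is not the authors' analysis code: the function names `classify` and `performance` are hypothetical, and the handling of a score of exactly 50% is an assumption, because the text defines only the "less than" and "more than" cases. The confidence intervals are not recomputed here because the interval method is not stated.

```python
# Counts reported for the independent test set (87 patients with CRC, 73 controls).
TP = 73  # patients with CRC whose probability score was above 50%
FN = 14  # patients with CRC whose probability score was below 50%
TN = 64  # controls whose probability score was below 50%
FP = 9   # controls whose probability score was above 50%


def classify(probability_score, threshold=0.5):
    """Apply the 50% decision threshold to a probability score in [0, 1].

    Assumption: a score exactly equal to the threshold is treated as a
    control; the article does not specify this boundary case.
    """
    return "CRC" if probability_score > threshold else "control"


def performance(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # correctly classified / all samples
        "sensitivity": tp / (tp + fn),   # correctly classified CRC / all CRC
        "specificity": tn / (tn + fp),   # correctly classified controls / all controls
    }


metrics = performance(TP, FN, TN, FP)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Run as written, the sketch reproduces the reported 85.6% accuracy (137 of 160), 83.9% sensitivity (73 of 87), and 87.7% specificity (64 of 73).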

Figure 4.

Validation of the 18-gene signature in the test set. A, for each sample, the probability score was calculated. The dashed line at 50% indicates the CRC versus control decision threshold. A sample was classified as CRC if the probability score was more than 50%. For controls, 64 samples were correctly classified, and 9 samples were misclassified. For patients with cancer, 73 of 87 samples were correctly classified. Of those misclassified patients with cancer, 4 were stage I, 4 were stage II, 5 were stage III, and 1 was stage IV. B, the area under the receiver operating characteristics curve reached 0.94, and the 95% CI was 0.91 to 0.98.

Of the 9 misclassified controls, 4 were female and 5 were male; 3 were over 60 years old. Of the 14 misclassified patients with cancer, 5 were female and 9 were male; 4 were over 60 years old. In addition, 4 of the 14 misclassified patients with cancer were stage I, 4 were stage II, 5 were stage III, and 1 was stage IV; 2 had colon cancer and 12 had rectal cancer. Fisher exact test suggested that clinical and pathologic variables, such as age, gender, tumor site, and stage, did not obviously affect the prediction outcomes (Supplementary Table S3).

The early detection of CRC is crucial for successful treatment and patient survival. However, the lack of compliance remains the greatest challenge currently limiting CRC screening effectiveness. In this study, we aimed to identify and validate a blood-based gene signature that could distinguish patients with CRC from controls with high accuracy. Our work followed the guidelines on the phases of biomarker development for early detection of cancer (28). We first conducted a microarray study to select genes that were significantly differentially expressed between the controls and patients with CRC. The selected genes were then transferred to a qRT-PCR platform. The qRT-PCR study was conducted for signature identification and validation using 2 independent cohorts. We present a blood-based 18-gene signature that can be used as a biomarker to discriminate between patients with CRC and controls, with a sensitivity of 84% and a specificity of 88% in the independent test set. Fisher exact test showed no association between clinical variables (age, gender, tumor site, and stage) and the prediction outcome assigned to each sample, suggesting that the 18-gene signature performed equally well across different categories of samples.

An understanding of the function of the genes comprising the signature could provide mechanistic insight into the diagnostic effect of this gene panel. Functional annotations revealed that many of the selected genes were related to immune function. GZMB, which was significantly downregulated in patients with CRC, is crucial for the rapid induction of target cell apoptosis by CTL and natural killer cells in the cell-mediated immune response (29). IL1B, an important mediator of the inflammatory response, is involved in different mechanisms leading to tumorigenesis via tumor-associated inflammation and neovascularization (30). IL1B has been shown to upregulate COX2 expression in human CRC cells (31), which may contribute to the growth and metastatic potential of CRC by increasing the expression of the antiapoptotic factor BCL-2 and upregulating specific angiogenic factors (32, 33). Furthermore, polymorphisms in IL1B have been associated with tumor recurrence in stage II colon cancer (34). The expression of SH2D2A is limited to immune system tissues, particularly activated T cells. SH2D2A has been reported as a positive regulator of proximal T-cell receptor signal transduction (35). CD36 is expressed by various types of cells that are associated with the blood and the immune system. High CD36 expression is related to decreased stromal vascularization and is a predictor of good prognosis in colon cancer (36). A polymorphism in CD36 (A52C) has been associated with an increased risk for CRC (37).

In addition, several of these genes have been associated with other human carcinomas. In lymphoma, MYBL1 was shown to activate the BCL2 P2 promoter through a Cdx-binding site, promoting resistance to apoptosis (38). PDE4D was reported to be overexpressed in human prostate cancers and associated with increased tumor growth and cell migration (39). PDE4D is also expressed in lung cancer, interacting with hypoxia-inducible factor (HIF) signaling and promoting lung cancer progression (40). PDZK1IP1 was shown to be overexpressed in a variety of human cancers in the kidney, colon, lung, and breast (41). The overexpression of the PDZK1IP1 protein was correlated with tumor progression in prostate and ovarian carcinomas (42).

Interestingly, a long noncoding RNA, NEAT1, was the most significantly upregulated gene in our analysis. Recent studies have shown that NEAT1 plays an essential role in the assembly and architecture of nuclear paraspeckles and colocalizes with the paraspeckle-associated proteins p54nrb, PSP1, and PSF (43–45). NEAT1 may therefore play an important role in regulating gene expression by governing the nuclear export of mRNAs. Our study is the first to report the deregulation of DHRS13, FAM198B, GLT25D2, ITPRIPL2, NUDT16, P2RY10, and VSIG10 mRNA expression in the peripheral blood in association with colorectal carcinoma.

Recently, several studies have suggested that the gene expression signatures identified in peripheral blood are likely not conventional tumor-derived cancer biomarkers but rather reflect subtle alterations in blood gene expression serving as a systemic immune response to tumorigenesis (8, 46–49). Our results are consistent with these findings, as evidenced by the fact that the signature was enriched in genes related to immune functions. Recent work has elucidated the role of distinct immune cells, cytokines, and other immune mediators in virtually all steps of colorectal tumorigenesis, including initiation, promotion, progression, and metastasis (50). Although we cannot entirely exclude the possibility that metastatic tumor cells or circulating cell-free tumor nucleic acids affect the gene expression profiles of peripheral blood, we postulate that the distinct gene expression observed in controls and patients with CRC is most likely attributable to the interactions between the immune system and tumors. On the basis of these results, future research is needed to understand the mechanistic relationship and the biologic meaning of this complex blood gene expression signature in CRC.

Although the 18-gene signature could be developed as an in vitro blood test for general population screening, the initial clinical use of our biomarkers is more likely to be as a complementary test to existing screening methods. In clinical practice, only 30% of high-risk individuals with FOBT-positive results eventually undergo colonoscopy. Our biomarker may therefore provide an additional risk assessment for noncompliant populations. Because a higher probability score by the 18-gene signature increases the likelihood that a patient has cancer, the probability score may provide clinically actionable risk information. Patients with a probability score less than 50% may decide for themselves whether to undergo colonoscopy. For those who decline colonoscopy, repeat blood tests would be recommended at intervals consistent with practice guidelines, for example, annually. Patients with a probability score more than 50% should be strongly advised to undergo colonoscopy for confirmation. The combination of blood biomarkers with existing screening methods can provide several major advantages. It addresses the greatest challenge currently limiting CRC screening effectiveness, namely, the lack of compliance. Blood biomarkers may be valuable for reaching the segment of the screening population that is resistant to the currently recommended methods, as well as for providing screening to underserved patients with limited access to endoscopy centers. In addition, blood biomarkers may help to minimize false-positive FOBTs and thus reduce the number and cost of colonoscopies.

Considerable research effort continues for the development of an accurate, reliable, and minimally invasive blood test for the detection of CRC. Han and colleagues and Marshall and colleagues identified 5- and 7-gene signatures, respectively, by analyzing gene expression profiles in whole-blood samples from patients with CRC and controls using qRT-PCR assays (7, 8). Although similar approaches were used for biomarker identification, the 2 reported signatures share no genes with our 18-gene signature. The absence of concordant genes could be related to many different issues, including differences in the study populations, the gene quantification technologies, and the statistical approaches used to generate the gene signatures, highlighting the need for extensive validation before the clinical implementation of these promising biomarkers. Our results represent an encouraging first step, but several issues remain to be addressed. Additional external validation studies are being conducted to establish a standard testing protocol and to confirm the accuracy of the signature. The use of the 18-gene signature for detecting precancerous lesions, such as polyps and adenomas, also needs to be evaluated. Finally, a large prospective study in the target screening population is required to fully determine the clinical performance of the blood-based gene signature in comparison with colonoscopy and to assess the practical feasibility of implementing the blood test in a screening program.

In conclusion, our study describes the development and validation of a blood-based 18-gene signature that differentiates patients with CRC from controls with a high degree of accuracy in a large number of participants. Our results open an avenue for the further development of blood-based gene expression biomarkers for the diagnosis and early detection of CRC.

Q. Xu, F. Wu, and X. Meng are employed as Senior research scientist, Technician, and Asia Pacific Scientific Director, respectively, at bioMérieux (Shanghai) Co., Ltd. No potential conflicts of interest were disclosed by the other authors.

Conception and design: Y. Xu, Q. Xu, L. Yang, G. Cai, X. Meng, S. Cai, X. Du

Development of methodology: Y. Xu, Q. Xu, L. Yang, X. Ye

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Y. Xu, L. Yang, X. Ye, F. Liu, S. Ni, C. Tan, S. Cai

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Q. Xu

Writing, review, and/or revision of the manuscript: Y. Xu, Q. Xu, L. Yang, X. Meng, X. Du

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Y. Xu, X. Ye, F. Liu, F. Wu, G. Cai, S. Cai

Study supervision: Y. Xu, G. Cai, X. Meng, S. Cai, X. Du

The authors thank Wencui Huang at the Fudan University Shanghai Cancer Center and Ying Jin, Ling Zhang, and Aijun Ye at the Shanghai Qibao Community Clinic for their excellent work in blood sample collection.

This study was supported by grants from bioMérieux (Shanghai) Co., Ltd. This study was also supported by the Chinese National Clinical Key Discipline (2011–2012) and the Shanghai Science and Technology Commission of Shanghai Municipality (no. 10DJ1400500).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Tenesa A, Dunlop MG. New insights into the aetiology of colorectal cancer from genome-wide association studies. Nat Rev Genet 2009;10:353–8.
2. Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer 2010;127:2893–917.
3. Walsh JM, Terdiman JP. Colorectal cancer screening: scientific review. JAMA 2003;289:1288–96.
4. Grutzmann R, Molnar B, Pilarsky C, Habermann JK, Schlag PM, Saeger HD, et al. Sensitive detection of colorectal cancer in peripheral blood by septin 9 DNA methylation assay. PLoS ONE 2008;3:e3759.
5. DeVos T, Tetzner R, Model F, Weiss G, Schuster M, Distler J, et al. Circulating methylated SEPT9 DNA in plasma is a biomarker for colorectal cancer. Clin Chem 2009;55:1337–46.
6. Lofton-Day C. Opportunities and limitations of blood-based CRC screening tests. Pract Gastroenterol 2012;XXXVI:29–34.
7. Han M, Liew CT, Zhang HW, Chao S, Zheng R, Yip KT, et al. Novel blood-based, five-gene biomarker set for the detection of colorectal cancer. Clin Cancer Res 2008;14:455–60.
8. Marshall KW, Mohr S, Khettabi FE, Nossova N, Chao S, Bao W, et al. A blood-based biomarker panel for stratifying current risk for colorectal cancer. Int J Cancer 2010;126:1177–86.
9. Rosenthal A, Mayr T, Koeppen H, Musikowski R, Berendt J, Goebel U, et al. Detector-c 2.0: a highly accurate blood-based IVD test for early detection of colorectal cancer with sensitivity and specificity over 90% [abstract]. J Clin Oncol 30, 2012 (suppl 4; abstr 393).
10. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001;98:5116–21.
11. Stamova BS, Apperson M, Walker WL, Tian Y, Xu H, Adamczy P, et al. Identification and validation of suitable endogenous reference genes for gene expression studies in human peripheral blood. BMC Med Genomics 2009;2:49.
12. Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, et al. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol 2002;3:RESEARCH0034.
13. Andersen CL, Jensen JL, Orntoft TF. Normalization of real-time quantitative reverse transcription-PCR data: a model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Res 2004;64:5245–50.
14. Pfaffl MW, Tichopad A, Prgomet C, Neuvians TP. Determination of stable housekeeping genes, differentially regulated target genes and sample integrity: BestKeeper–Excel-based tool using pair-wise correlations. Biotechnol Lett 2004;26:509–15.
15. Silver N, Best S, Jiang J, Thein SL. Selection of housekeeping genes for gene expression studies in human reticulocytes using real-time PCR. BMC Mol Biol 2006;7:33.
16. Chen D, Pan X, Xiao P, Farwell MA, Zhang B. Evaluation and identification of reliable reference genes for pharmacogenomics, toxicogenomics, and small RNA expression analysis. J Cell Physiol 2011;226:2469–77.
17. Ihaka R, Gentleman R. R: a language for data analysis and graphics. J Comput Graph Stat 1996;5:299–314.
18. Reimers M, Carey VJ. Bioconductor: an open source framework for bioinformatics and computational biology. Methods Enzymol 2006;411:119–34.
19. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003;4:249–64.
20. Maglott D, Ostell J, Pruitt KD, Tatusova T. Entrez gene: gene-centered information at NCBI. Nucleic Acids Res 2011;39:D52–7.
21. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-delta delta C(T)) method. Methods 2001;25:402–8.
22. Ding C, Peng H. Minimum redundancy feature selection from microarray gene expression data. J Bioinform Comput Biol 2005;3:185–205.
23. Chang C, Lin C. LIBSVM: a library for support vector machines. ACM Trans Intel Syst Technol 2011;2:21–7.
24. Blake JA, Harris MA. The Gene Ontology (GO) project: structured vocabularies for molecular biology and their application to genome and expression analysis. Curr Protoc Bioinformatics 2008;Chapter 7: Unit 7.2.
25. Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res 2012;40:D109–14.
26. Tabas-Madrid D, Nogales-Cadenas R, Pascual-Montano A. GeneCodis3: a non-redundant and modular enrichment analysis tool for functional genomics. Nucleic Acids Res 2012;40:W478–83.
27. Chaussabel D, Quinn C, Shen J, Patel P, Glaser C, Baldwin N, et al. A modular analysis framework for blood genomics studies: application to systemic lupus erythematosus. Immunity 2008;29:150–64.
28. Pepe MS, Etzioni R, Feng Z, Potter JD, Thompson ML, Thornquist M, et al. Phases of biomarker development for early detection of cancer. J Natl Cancer Inst 2001;93:1054–61.
29. Trapani JA, Sutton VR. Granzyme B: pro-apoptotic, antiviral and antitumor functions. Curr Opin Immunol 2003;15:533–43.
30. Aggarwal BB, Shishodia S, Sandur SK, Pandey MK, Sethi G. Inflammation and cancer: how hot is the link? Biochem Pharmacol 2006;72:1605–21.
31. Liu W, Reinmuth N, Stoeltzing O, Parikh AA, Tellez C, Williams S, et al. Cyclooxygenase-2 is up-regulated by interleukin-1 beta in human colorectal cancer cells via multiple signaling pathways. Cancer Res 2003;63:3632–6.
32. Tsujii M, Kawano S, Tsuji S, Sawaoka H, Hori M, DuBois RN. Cyclooxygenase regulates angiogenesis induced by colon cancer cells. Cell 1998;93:705–16.
33. Tsujii M, DuBois RN. Alterations in cellular adhesion and apoptosis in epithelial cells overexpressing prostaglandin endoperoxide synthase 2. Cell 1995;83:493–501.
34. Lurje G, Hendifar AE, Schultheis AM, Pohl A, Husain H, Yang D, et al. Polymorphisms in interleukin 1 beta and interleukin 1 receptor antagonist associated with tumor recurrence in stage II colon cancer. Pharmacogenet Genomics 2009;19:95–102.
35. Marti F, Garcia GG, Lapinski PE, MacGregor JN, King PD. Essential role of the T cell-specific adapter protein in the activation of LCK in peripheral T cells. J Exp Med 2006;203:281–7.
36. Tsuchida T, Kijima H, Tokunaga T, Oshika Y, Hatanaka H, Fukushima Y, et al. Expression of the thrombospondin 1 receptor CD36 is correlated with decreased stromal vascularisation in colon cancer. Int J Oncol 1999;14:47–51.
37. Kuriki K, Hamajima N, Chiba H, Kanemitsu Y, Hirai T, Kato T, et al. Increased risk of colorectal cancer due to interactions between meat consumption and the CD36 gene A52C polymorphism among Japanese. Nutr Cancer 2005;51:170–7.
38. Heckman CA, Mehew JW, Ying GG, Introna M, Golay J, Boxer LM. A-Myb up-regulates Bcl-2 through a Cdx binding site in t(14;18) lymphoma cells. J Biol Chem 2000;275:6499–508.
39. Rahrmann EP, Collier LS, Knutson TP, Doyal ME, Kuslak SL, Green LE, et al. Identification of PDE4D as a proliferation promoting factor in prostate cancer using a Sleeping Beauty transposon-based somatic mutagenesis screen. Cancer Res 2009;69:4388–97.
40. Pullamsetti SS, Banat GA, Schmall A, Szibor M, Pomagruk D, Hanze J, et al. Phosphodiesterase-4 promotes proliferation and angiogenesis of lung cancer by crosstalk with HIF. Oncogene 2013;32:1121–34.
41. Kocher O, Cheresh P, Lee SW. Identification and partial characterization of a novel membrane-associated protein (MAP17) up-regulated in human carcinomas and modulating cell replication and tumor growth. Am J Pathol 1996;149:493–500.
42. Guijarro MV, Leal JF, Fominaya J, Blanco-Aparicio C, Alonso S, Lleonart M, et al. MAP17 overexpression is a common characteristic of carcinomas. Carcinogenesis 2007;28:1646–52.
43. Clemson CM, Hutchinson JN, Sara SA, Ensminger AW, Fox AH, Chess A, et al. An architectural role for a nuclear noncoding RNA: NEAT1 RNA is essential for the structure of paraspeckles. Mol Cell 2009;33:717–26.
44. Souquere S, Beauclair G, Harper F, Fox A, Pierron G. Highly ordered spatial organization of the structural long noncoding NEAT1 RNAs within paraspeckle nuclear bodies. Mol Biol Cell 2010;21:4020–7.
45. Murthy UM, Rangarajan PN. Identification of protein interaction regions of VINC/NEAT1/Men epsilon RNA. FEBS Lett 2010;584:1531–5.
46. Showe MK, Vachani A, Kossenkov AV, Yousef M, Nichols C, Nikonova EV, et al. Gene expression profiles in peripheral blood mononuclear cells can distinguish patients with non–small cell lung cancer from patients with nonmalignant lung disease. Cancer Res 2009;69:9202–10.
47. Zander T, Hofmann A, Staratschek-Jox A, Classen S, Debey-Pascher S, Maisel D, et al. Blood-based gene expression signatures in non-small cell lung cancer. Clin Cancer Res 2011;17:3360–7.
48. Olmos D, Brewer D, Clark J, Danila DC, Parker C, Attard G, et al. Prognostic value of blood mRNA expression signatures in castration-resistant prostate cancer: a prospective, two-stage study. Lancet Oncol 2012;13:1114–24.
49. Ross RW, Galsky MD, Scher HI, Magidson J, Wassmann K, Lee GS, et al. A whole-blood RNA transcript-based prognostic model in men with castration-resistant prostate cancer: a prospective study. Lancet Oncol 2012;13:1105–13.
50. Terzic J, Grivennikov S, Karin E, Karin M. Inflammation and colon cancer. Gastroenterology 2010;138:2101–14.