Longitudinal blood collections from cohort studies provide the means to search for proteins associated with disease before clinical diagnosis. We investigated plasma samples from the Women's Health Initiative (WHI) cohort to determine quantitative differences in plasma proteins between subjects subsequently diagnosed with colorectal cancer (CRC) and matched controls who remained cancer-free during the period of follow-up. Proteomic analysis of WHI samples collected before diagnosis of CRC resulted in the identification of six proteins with significantly (P < 0.05) elevated concentrations in cases compared with controls. Proteomic analysis of two CRC cell lines showed that five of the six proteins were produced by cancer cells. Microtubule-associated protein RP/EB family member 1 (MAPRE1), insulin-like growth factor–binding protein 2 (IGFBP2), leucine-rich alpha-2-glycoprotein (LRG1), and carcinoembryonic antigen (CEA) were individually assayed by enzyme-linked immunosorbent assay (ELISA) in 58 pairs of newly diagnosed CRC samples and controls and yielded significant elevations (P < 0.05) among cases relative to controls. A combination of these four markers resulted in a receiver operating characteristic curve with an area under the curve value of 0.841 and 57% sensitivity at 95% specificity. This combination rule was tested in an independent set of WHI samples collected within 7 months before diagnosis from cases and matched controls, resulting in 41% sensitivity at 95% specificity. A panel consisting of CEA, MAPRE1, IGFBP2, and LRG1 has predictive value in prediagnostic CRC plasmas. Cancer Prev Res; 5(4); 655–64. ©2012 AACR.

Current screening methods for colorectal cancer (CRC) have had an impact on mortality associated with this disease (1). There is a 30% to 40% drop in risk of developing CRC following a negative result from a colonoscopy and as much as a 50% reduction in incidence in the portion of the bowel examined by sigmoidoscopy or colonoscopy (2). Even with these decreases in risk and incidence, it is estimated that about 60% of subjects older than 50 years in the United States are not screened at recommended intervals (3, 4). In the case of colonoscopies, even when subjects are referred by their physician there is only an approximately 50% rate of adherence (5).

Plasma levels of carcinoembryonic antigen (CEA) are currently used as a preoperative prognostic indicator for CRC, with higher levels of CEA being positively correlated with a poor prognosis (6). CEA also has use for monitoring therapy in advanced disease and for patient surveillance following curative resection (7, 8). However, it lacks the sensitivity and specificity to be used as a diagnostic marker for CRC (6), hence the need for additional markers that could supplant it or augment its performance.

The ease of sampling plasma makes it a logical choice for the development of a panel of proteins that inform about risk of developing CRC. However, the plasma proteome is extremely complex and is composed of proteins ranging in concentration over at least 9 orders of magnitude. Recent work has shown that low-abundance plasma proteins may be identified with high confidence following extensive plasma fractionation (9). High-abundance proteins interfere with detection and quantification of less-abundant proteins, necessitating their removal before mass spectrometric (MS) analysis, typically through immunodepletion. After removal of the most highly abundant proteins, samples still require extensive fractionation by anion exchange and/or reverse-phase chromatography to reduce sample complexity enough to achieve adequate sampling of the plasma proteome.

Guidelines for the design of biomarker discovery and validation studies have been recommended (10). Retrospective longitudinal repository studies are used to evaluate biomarkers for their capacity to detect preclinical disease as a function of time before clinical diagnosis, as well as other sample characteristics that may define clinical applications. This is done by analyzing the most promising markers and developing algorithms for screening positivity based on a combination of markers. The use of specimens collected before diagnosis through longitudinal cohort studies meets prospective-specimen-collection, retrospective-blinded-evaluation (PRoBE) design requirements (11), reduces bias, and allows identification of proteins that may have value for early detection and risk assessment. Using samples from the Women's Health Initiative (WHI) cohort, an intact protein analysis system (IPAS) approach that allows quantitative analysis of proteins over 6 to 7 orders of magnitude of abundance (9, 12–14) was applied to plasmas from 90 participants who were subsequently diagnosed with colon cancer within 18 months following blood draw and to 90 matched controls from the same cohort. Further testing of a protein subset was conducted in samples from the Early Detection Research Network (EDRN) collected at the time of diagnosis, which included both male and female subjects. A panel established in the newly diagnosed cohort was subsequently shown to have predictive value in an independent set of prediagnostic CRC plasmas from the WHI cohort.

Study population

The sample population used in the discovery phase consisted of plasmas from 90 women who were diagnosed with CRC within 18 months following a blood draw that occurred in the third year of participation in the WHI Observational Study. These cases were individually matched on the basis of age (±2 years), race/ethnicity, and baseline blood draw (±2 months) to a randomly selected control without a history of cancer diagnosis (Table 1).
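The matching criteria above can be expressed as a simple eligibility check. The following is a minimal sketch under stated assumptions, not the procedure used by the WHI: the `Subject` fields, the representation of blood draw timing as a month index, and the function name are all hypothetical. The tolerances are parameters, so the same check could be applied with other windows.

```python
from dataclasses import dataclass

@dataclass
class Subject:
    age_years: float
    race_ethnicity: str
    draw_month: int        # hypothetical: months since a fixed reference date
    cancer_history: bool

def is_eligible_control(case, control, age_tol=2, draw_tol_months=2):
    """Return True if the control satisfies the matching criteria for the case."""
    return (
        not control.cancer_history
        and abs(case.age_years - control.age_years) <= age_tol
        and case.race_ethnicity == control.race_ethnicity
        and abs(case.draw_month - control.draw_month) <= draw_tol_months
    )

# Hypothetical subjects for illustration only.
case = Subject(65.0, "Caucasian", 36, False)
close_control = Subject(66.5, "Caucasian", 37, False)
older_control = Subject(68.0, "Caucasian", 36, False)
```

In practice one control would then be drawn at random from the eligible candidates for each case.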

Table 1.

Prediagnostic WHI cohort sample characteristics

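| Characteristic | Discovery set, case | Discovery set, control | Validation set, case | Validation set, control |
|---|---|---|---|---|
| Number | 90 | 90 | 32 | 32 |
| Average age, y | 64.9 | 64.9 | 68.1 | 67.8 |
| Stage I | 8 (9%) | - | 4 (13%) | - |
| Stage II | 29 (32%) | - | 13 (41%) | - |
| Stage III | 37 (41%) | - | 12 (38%) | - |
| Stage IV | 16 (18%) | - | 3 (9%) | - |
| Caucasian | 75 (83%) | 75 (83%) | 24 (75%) | 24 (75%) |
| African-American | 10 (11%) | 10 (11%) | 7 (22%) | 7 (22%) |
| Asian | 3 (3%) | 3 (3%) | | |
| Hispanic | 1 (1%) | 1 (1%) | | |
| Other | | 1 (1%) | 1 (3%) | 1 (3%) |
| Missing | 1 (1%) | | | |
| Average days to diagnosis | 245 | - | 109 | - |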

Plasma samples from 58 newly diagnosed male and female patients with CRC and 58 matched controls were collected through the Community Clinical Oncology Program at the University of Michigan, Ann Arbor, MI, following informed consent. Cases were individually matched to controls on the basis of age (within 3 years) and gender.

An independent set of plasmas from 32 subjects in the WHI Observational Study who were diagnosed with CRC within 7 months following the third year blood draw and 32 matched controls was used for validation. Matching was done based on age (±4 years), race/ethnicity, baseline blood draw (±4 months), body mass index, hormone therapy use, and a negative history for cancer (Table 1).

Proteomic analysis

IPAS.

Nine large-scale proteomic experiments were carried out on pools of plasmas from 10 cases and 10 controls as previously described (refs. 12, 15; Supplementary Fig. S1). In 4 experiments, the pool of cases was labeled with light acrylamide, and its matched control pool was labeled with heavy 1,2,3-13C-acrylamide isotope. The labeling was switched in the other case–control pools. In each experiment, the pool of cases and the pool of matched controls were mixed together before further processing and mass spectrometry.

Proteins were separated by an automated online 2D-HPLC system controlled by Workstation Class-VP 7.4 (Shimadzu Corporation). Separation consisted of anion exchange chromatography followed by reverse-phase chromatography.

In-solution tryptic digestion was conducted with lyophilized aliquots from the reverse-phase (second dimension) fractionation step. Aliquots were subjected to MS shotgun analysis using an LTQ-Orbitrap (Thermo) mass spectrometer coupled with a NanoLC-1D (Eksigent). The acquired data were automatically processed by the Computational Proteomics Analysis System (CPAS; ref. 16). For the identification of proteins with false-positive error rate less than 5%, liquid chromatography/tandem mass spectrometry (LC/MS-MS) spectra of the samples were subjected to tryptic searches against the human IPI database (v.3.13) using X! Tandem (17). Search results were then analyzed by PeptideProphet (18) and ProteinProphet (19) programs. Quantitative protein analysis was based on differential labeling of cysteine residues with acrylamide isotopes. Peptide isotopic ratios were plotted in logarithmic scale in a histogram, and the median of the distribution was centered at zero (Supplementary Fig. S2). All normalized peptide ratios for a specific protein were averaged to compute an overall protein ratio. Reported statistical significance of the protein quantitative information was obtained using a one-sample Student t test. False discovery rates were calculated on the basis of the distribution of P values from permutations of disease labels and the observed P values from the original data.
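The normalization and per-protein summary described above can be sketched as follows. This is a minimal illustration rather than the CPAS implementation: the function names, the choice of log2, averaging on the log scale, and the toy ratios are assumptions. The sketch returns only the t statistic; converting it to a P value would be left to a statistics package.

```python
import math
from statistics import mean, median, stdev

def center_log_ratios(ratios):
    """Convert case/control peptide ratios to log2 and shift the median to zero."""
    logs = [math.log2(r) for r in ratios]
    m = median(logs)
    return [x - m for x in logs]

def protein_summary(log_ratios):
    """Average a protein's normalized peptide log-ratios and compute a
    one-sample t statistic against a mean of zero (needs >= 2 peptides)."""
    n = len(log_ratios)
    avg = mean(log_ratios)
    t = avg / (stdev(log_ratios) / math.sqrt(n))
    return avg, t

# Hypothetical peptide ratios from one experiment (all proteins pooled).
all_ratios = [0.5, 1.0, 2.0, 4.0, 8.0]
centered = center_log_ratios(all_ratios)

# Hypothetical subset of centered log-ratios assigned to a single protein.
avg, t = protein_summary(centered[2:])
```

Centering on the median assumes that most peptides do not differ between the case and control pools, so any global shift reflects mixing or labeling imbalance rather than biology.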

CRC cell line proteomic analysis.

HCT116 and SW480 were prepared according to the standard stable isotope labeling with amino acids in cell culture (SILAC) protocol as previously described (20). Secreted proteins were obtained directly from conditioned media after 48 hours of culture. Total cell extract (TCE) was obtained by sonication of about 2 × 10⁷ cells followed by centrifugation at 20,000 × g. A surface-enriched fraction was obtained by biotinylating about 2 × 10⁸ cells in culture plates. Proteins were extracted in a 2% NP-40 solution and subsequently isolated using NeutrAvidin.

Cell line preparations were fractionated by reverse-phase chromatography. Reverse-phase fractions from each preparation were individually digested with trypsin and grouped into 23 to 27 pools on the basis of chromatographic features. LC/MS-MS and protein identification were conducted as described above using v3.57 of the human IPI database.

ELISA-based validation

Human IGFBP2 (R&D Systems), LRG1 (IBL-America), CEA (Genway Biotech, Inc.), and MAPRE1 (USCN Life) measurements were conducted on newly diagnosed and prediagnostic plasma samples according to the manufacturer's protocol. Absorbance was measured using a SpectraMax Plus 384 and results calculated with SoftMax Pro v4.7.1 (Molecular Devices). P values were computed using a paired Mann–Whitney–Wilcoxon test on raw concentration values. ELISA measurements above and below the detection limit for assays were imputed by the maximum and minimum computable values for the assay.
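The imputation rule and the paired comparison can be illustrated with a short sketch. Several things here are assumptions: out-of-range readings are represented by the hypothetical flags "<" and ">", the limits 0.5 and 100.0 are invented, and the paired test is taken to be the Wilcoxon signed-rank form. Only the test statistic is computed; a P value would come from a statistics package.

```python
def impute(reading, min_value, max_value):
    """Replace out-of-range flags with the assay's min or max computable value."""
    if reading == "<":          # below the detection limit
        return min_value
    if reading == ">":          # above the detection limit
        return max_value
    return float(reading)

def signed_rank_statistic(cases, controls):
    """Wilcoxon signed-rank W+ for paired samples (zero differences dropped,
    tied absolute differences given their average rank)."""
    diffs = [a - b for a, b in zip(cases, controls) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1      # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return sum(r for r, d in zip(ranks, diffs) if d > 0)

# Hypothetical raw readings for four matched pairs.
cases = [impute(x, 0.5, 100.0) for x in ["12.0", ">", "3.0", "<"]]
controls = [impute(x, 0.5, 100.0) for x in ["4.0", "20.0", "5.0", "<"]]
w_plus = signed_rank_statistic(cases, controls)
```

Because a rank-based test depends only on the ordering of the paired differences, clamping extreme readings to the assay limits has less influence here than it would on a test based on means.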

Proteomic analysis of plasma from study subjects and CRC cell lines

An in-depth quantitative MS analysis of WHI plasma samples in 9 large-scale experiments yielded a total of 1,992,567 mass spectra, resulting in a total of 5,022 unique protein IDs in the International Protein Index (IPI) database (21). Quantitative data based on isotopic ratios for case versus control were obtained for 1,779 proteins. An overall P value and a geometric mean ratio for each protein across all 9 experiments were calculated. Six proteins were significantly (P < 0.05) elevated in cases compared with controls with a case-to-control ratio >1.2 (Table 2). Microtubule-associated protein RP/EB family member 1 (MAPRE1) is a cytoplasmic protein that binds to adenomatous polyposis coli (APC), a commonly mutated gene in colorectal adenocarcinoma, and functions in mitotic processes (22). Leucine-rich alpha-2-glycoprotein (LRG1) is an extracellular protein whose function is largely unknown with varied expression levels in tissues (23). A role for LRG1 in granulocyte differentiation has been suggested (24). Insulin-like growth factor–binding protein 2 (IGFBP2) is an extracellular protein that binds IGF2 and has been shown to potentially have both proliferative and antiproliferative roles in cancer (25). Enolase 1 has been identified as a central element in a disease-specific gene network in colon cancer (26). Mesencephalic astrocyte-derived neurotrophic factor (ARMET) and protein disulfide-isomerase A3 (PDIA3) belong to a family of endoplasmic reticulum stress–induced proteins which have been found to be upregulated in gastric and hepatocellular carcinomas (27, 28). Mass spectrometric analysis yielded substantial peptide coverage for all 6 proteins (Fig. 1A–F), indicating a robust identification of each full-length protein in human plasma.
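The selection criteria (overall P < 0.05 and a geometric mean case-to-control ratio above 1.2) can be written as a small filter. This is a sketch under assumptions: the function names and example ratios are hypothetical, and the overall P value is taken as an input because the method used to combine evidence across experiments is not detailed here.

```python
import math

def geometric_mean(ratios):
    """Geometric mean of per-experiment case/control ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

def is_candidate(ratios, p_value, min_ratio=1.2, alpha=0.05):
    """Apply the thresholds: overall P < alpha and geometric mean ratio > min_ratio."""
    return p_value < alpha and geometric_mean(ratios) > min_ratio

# Hypothetical per-experiment ratios for two proteins.
elevated = [1.0, 2.0, 4.0]
marginal = [1.0, 1.1, 1.2]
```

A geometric mean is the natural average for ratios because it treats a twofold increase and a twofold decrease symmetrically.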

Figure 1.

Tryptic peptides identified in IPAS MS experiments for (A) MAPRE1, (B) LRG1, (C) IGFBP2, (D) ENO1, (E) ARMET, and (F) PDIA3. Each line represents peptides identified or quantified in an individual IPAS experiment. The top line in each panel represents theoretical peptides generated by tryptic digestion. Dark grey peptides indicate non-cysteine–containing tryptic peptides, whereas striped peptides are cysteine-containing tryptic peptides that could be quantified from the acrylamide labeling. In the experimental data, dark grey peptides indicate the peptide was identified, but not quantified, whereas striped peptides indicate the peptide had quantification data from the acrylamide labeling.

Table 2.

Proteins with significantly elevated levels in plasmas collected before diagnosis of CRC compared with matched controls from the WHI cohort that did not develop CRC during the period of follow-up

| Gene | Description | Average case–control ratio | P | FDR |
|---|---|---|---|---|
| MAPRE1 | Microtubule-associated protein RP/EB family member 1 | 4.52 | 0.019 | 0.763 |
| PDIA3 | Protein disulfide-isomerase A3 | 1.51 | 0.027 | 0.471 |
| IGFBP2 | Insulin-like growth factor–binding protein 2 | 1.20 | 0.027 | 0.420 |
| ENO1 | Alpha-enolase | 1.96 | 0.029 | 0.713 |
| ARMET | Mesencephalic astrocyte-derived neurotrophic factor | 1.64 | 0.043 | 0.417 |
| LRG1 | Leucine-rich alpha-2-glycoprotein | 1.35 | 0.046 | 0.487 |

Abbreviation: FDR, false discovery rate.

To determine whether the identified proteins may have originated from tumor cells or from a host response, proteomic analysis of 2 CRC cell lines with different driver mutations was conducted using SILAC (20). HCT116 and SW480 were analyzed to assess potential differences in protein expression based on APC mutational status. MAPRE1, IGFBP2, alpha-enolase (ENO1), PDIA3, and ARMET were identified in both HCT116 and the APC-mutant SW480 (Supplementary Fig. S3a–S3e), whereas LRG1 was not identified in either of the 2 cell lines. MAPRE1, ENO1, PDIA3, and ARMET were observed in TCE, the media, and surface-enriched fractions in both cell lines. The APC-binding domain of MAPRE1 was enriched in the media and cell surface fractions. PDIA3 was enriched in the cell surface compartment with fewer peptides identified in the conditioned media. ARMET was also enriched on the cell surface of the SW480 cell line but not in HCT116. ENO1 was the most frequently identified protein in both TCE and conditioned media. IGFBP2 was predominantly observed in the conditioned media, with few peptides identified in the TCE. These cell line findings suggest that tumor cells may contribute to increased levels observed in plasma for MAPRE1, IGFBP2, ENO1, PDIA3, and ARMET.

Assays of IGFBP2, LRG1, and MAPRE1 in plasmas from newly diagnosed CRC cases

Three of these 6 proteins (MAPRE1, LRG1, and IGFBP2) had ELISA assays available for further validation studies. IGFBP2, LRG1, and MAPRE1 along with CEA were assayed in plasma from newly diagnosed CRC subjects and controls (Fig. 2). Given that the discovery studies were based on pools of cases and controls, ELISA assays of individual samples were relied upon to develop a combination rule for validation of the marker panel in an independent set of prediagnostic samples. All 4 of the assayed proteins were found to be significantly (P < 0.05) elevated by more than 1.5-fold in cases compared with controls (Table 3) in a set of 58 newly diagnosed CRC cases and 58 age-matched controls. Area under the curve values (AUC) for IGFBP2, LRG1, MAPRE1, and CEA ranged from 0.712 to 0.782 (Table 3). Linear regression analyses based on maximum likelihood estimation of raw ELISA values were conducted on all possible combinations of the 4 markers. A combination of all 4 markers (denoted “Panel”) was found to have the highest AUC of 0.841 with 59% sensitivity at 95% specificity, a 23% increase over CEA alone (Fig. 3A). Scatter plots of ELISA responses showed that levels of MAPRE1 and CEA correlated well (Supplementary Fig. S4). Similarly, IGFBP2 and LRG1 were also highly correlated, whereas MAPRE1 and LRG1 exhibited an orthogonal relationship.
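The two performance measures reported here, AUC and sensitivity at a fixed 95% specificity, can be computed from case and control scores as sketched below. This is an illustrative implementation, not the analysis code used in the study; the function names and the toy scores are assumptions, and the threshold convention (a sample is called positive when its score is strictly above the threshold) is one of several reasonable choices.

```python
import math

def auc(case_scores, control_scores):
    """Probability that a random case scores above a random control (ties count 1/2)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def sensitivity_at_specificity(case_scores, control_scores, specificity=0.95):
    """Fraction of cases above the lowest threshold that keeps the target specificity."""
    ranked = sorted(control_scores)
    # Smallest number of controls that must fall at or below the threshold.
    needed = math.ceil(specificity * len(ranked))
    threshold = ranked[needed - 1]
    return sum(s > threshold for s in case_scores) / len(case_scores)

# Hypothetical scores for illustration only.
control_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
case_scores = [5, 11, 12, 9]
panel_auc = auc(case_scores, control_scores)
panel_sens = sensitivity_at_specificity(case_scores, control_scores)
```

With a small control group the achievable specificities are coarse, so the threshold is set conservatively: at least 95% of controls must fall at or below it.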

Figure 2.

ELISA results in newly diagnosed samples of (A) CEA, (B) MAPRE1, (C) LRG1, and (D) IGFBP2. Each circle represents one of 58 individual cases or matched controls. ****, P < 0.0001 based on paired Wilcoxon test.

Figure 3.

A, receiver operating characteristic (ROC) curve analysis of a linear combination of the 4 markers compared with CEA in newly diagnosed samples. B, ROC curve analysis of the same linear combination of the 4 markers compared with CEA in prediagnostic samples. Coefficients for combination of the 4 markers are: CEA, 3.612e-2; IGFBP2, 7.052e-3; LRG1, 7.263e-5; and MAPRE1, 2.766e-2.

Table 3.

ELISA summary for individual proteins in newly diagnosed and prediagnostic CRC samples compared with matched healthy controls

| Marker | Newly diagnosed, ratio | Newly diagnosed, P | Newly diagnosed, AUC | Prediagnostic, ratio | Prediagnostic, P | Prediagnostic, AUC |
|---|---|---|---|---|---|---|
| CEA | 45.76 | 1.9E-06 | 0.782 (0.706–0.857) | 8.42 | 0.0261 | 0.602 (0.463–0.740) |
| IGFBP2 | 1.66 | 6.2E-05 | 0.717 (0.624–0.811) | 1.27 | 0.0984 | 0.586 (0.444–0.728) |
| LRG1 | 1.60 | 5.0E-05 | 0.712 (0.615–0.808) | 1.40 | 8.2E-05 | 0.723 (0.594–0.851) |
| MAPRE1 | 10.79 | 3.9E-07 | 0.778 (0.691–0.864) | 5.30 | 0.0066 | 0.701 (0.570–0.831) |

NOTE: Newly diagnosed samples are both male and female. Prediagnostic WHI samples are all female.

Assays of individual markers in an independent set of prediagnostic plasmas

The linear combination of CEA, IGFBP2, LRG1, and MAPRE1 that was constructed on the basis of the newly diagnosed samples was evaluated in an independent set of prediagnostic WHI plasma samples consisting of 32 CRC cases and 32 matched controls drawn within 7 months before the diagnosis of CRC. This combination rule resulted in an AUC of 0.724, with 41% sensitivity at 95% specificity (Fig. 3B), compared with 19% sensitivity at 95% specificity for CEA alone.
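Applying a fixed combination rule to new samples amounts to computing a weighted sum of the four ELISA values. The sketch below uses the coefficients reported in the Figure 3 legend; everything else is an assumption for illustration, including the function name, the absence of an intercept, the measurement units, and the example concentrations, none of which are specified in the text.

```python
# Coefficients as reported in the Figure 3 legend.
COEFFICIENTS = {
    "CEA": 3.612e-2,
    "IGFBP2": 7.052e-3,
    "LRG1": 7.263e-5,
    "MAPRE1": 2.766e-2,
}

def panel_score(concentrations):
    """Weighted sum of raw ELISA values for the four markers.
    Assumes values are in the same units used to fit the coefficients."""
    return sum(COEFFICIENTS[m] * concentrations[m] for m in COEFFICIENTS)

# Hypothetical ELISA values for one sample (units not specified in the source).
sample = {"CEA": 10.0, "IGFBP2": 100.0, "LRG1": 10000.0, "MAPRE1": 10.0}
score = panel_score(sample)
```

Because the coefficients are frozen before the independent samples are scored, performance in the prediagnostic set is not inflated by refitting.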

Furthermore, CEA, LRG1, and MAPRE1 were each significantly elevated in cases compared with controls (Table 3). IGFBP2 was not significantly elevated in the prediagnostic samples, with a mean ratio of 1.27 and P < 0.1. Individual markers had AUCs between 0.586 (IGFBP2) and 0.723 (LRG1; Table 3). Ratios for each marker were lower in the prediagnostic samples than in the newly diagnosed group.

The proteomic analysis of 9 pools from 90 plasma samples from the WHI cohort collected before diagnosis and an equal number of matched controls yielded a set of 6 proteins that were significantly upregulated in cases compared with controls. Three of these proteins, LRG1, IGFBP2, and ARMET, are known to be secreted whereas MAPRE1, PDIA3, and ENO1 are predominantly intracellular. IGFBP2, LRG1, and MAPRE1 were selected for further characterization and validation on the basis of the availability of ELISA assays. Immunologic testing of these 3 proteins along with CEA in plasmas from newly diagnosed subjects showed significant (P < 0.05) elevation of each in cases compared with controls. A linear combination of the 4 proteins yielded 59% sensitivity at 95% specificity for plasmas from newly diagnosed cases relative to controls, indicative of the potential of the marker panel for improved monitoring of CRC. Addition of the 3 markers to CEA also improved performance in prediagnostic samples. Sensitivity was increased from 19% for CEA alone to 41% at 95% specificity for the panel in blood drawn within 7 months before the diagnosis of CRC. In addition, CEA, LRG1, and MAPRE1 were each significantly elevated in the prediagnostic plasmas. IGFBP2 yielded a case-to-control ratio of 1.27 before diagnosis but was not statistically significant. Prediagnostic samples separated by stage showed increased levels of CEA and MAPRE1 in stage III/IV cases compared with stage I/II cases (P = 0.068 and 0.120, respectively; Supplementary Fig. S5). For both proteins, only stage III/IV cases were significantly higher than controls. The elevation of LRG1 in cases compared with controls was more statistically significant in stage III/IV cases than in stage I/II cases. Our findings suggest that circulating plasma levels of CEA, LRG1, and MAPRE1 may all increase with tumor progression.

Extensive mass spectrometric evaluations of IGFBP2, LRG1, and MAPRE1 in other cancers and inflammatory diseases have been carried out by our group. Protein levels were on average unchanged across multiple experiments in both breast and lung cancer for each of the 3 proteins. In patients who developed coronary heart disease, MAPRE1 and IGFBP2 were decreased or unchanged in diseased individuals compared with matched controls whereas LRG1 was not quantified. CEA was not quantified in any of the mass spectrometric experiments likely due to its high degree of glycosylation.

A comprehensive proteomic analysis of an Apc Δ580 mouse model was previously conducted by our group (29). From that analysis, it was observed that circulating levels of both LRG1 and IGFBP2 were significantly (P < 0.05) elevated in tumor-bearing mice compared with controls. ENO1 and PDIA3 were also identified in the analysis of mouse plasma samples based on non-cysteine–containing peptides, thus lacking quantification, whereas no peptides from MAPRE1 or ARMET were identified, likely due to their very low abundance in plasma.

Mutation of the APC gene is considered to be one of the initiating events in the development of colorectal adenocarcinoma (30). The mutated form of APC is commonly truncated, retaining only the N-terminus, resulting in increased protein mobility and altered function (31). MAPRE1 is known to bind to APC and participate in the stabilization of microtubules through interactions with the formin mDia (32). Overexpression of MAPRE1 has been found to induce nuclear accumulation of β-catenin and activate the β-catenin/T-cell factor pathway leading to a promotion of cell growth and increase in colony formation (33, 34). Our study shows a significant elevation of circulating MAPRE1 protein in newly diagnosed and prediagnostic CRC plasma samples. Expression of MAPRE1 has been reported to be elevated in tissue from head and neck cancer (35) and to be correlated with tumor size and associated with poor differentiation in hepatocellular carcinoma tissue (36). Extensive proteomic analysis of 2 CRC cell lines, as well as Western blot analysis, resulted in the identification of MAPRE1 in conditioned media. Gene expression data from BioGPS indicated that MAPRE1 was strongly expressed in colorectal adenocarcinoma compared with most other tissues, including normal colon (37). Immunohistochemistry for MAPRE1 in colorectal tumor tissues from Human Protein Atlas (38) shows an increase in cytoplasmic staining compared with normal tissues. Our study has revealed for the first time an association between circulating levels of MAPRE1 and CRC.

Elevated plasma levels of LRG1 have previously been reported for pancreatic and ovarian cancers (39–41) but not for CRC. LRG1 is primarily expressed in the liver (37) and has been associated with acute-phase response, being induced by proinflammatory cytokines, such as interleukin 6 (IL-6; ref. 23). LRG1 was not observed in proteomic analysis of conditioned media from 2 CRC cell lines, suggesting that increased circulating levels are a response to tumor development. Elevated circulating levels have previously been associated with graft-versus-host disease (GVHD; ref. 14), as well as autoimmune diseases (42). LRG1 may be released from neutrophils (24, 43). LRG1 has also been associated with TGFβ signaling (44), specifically through interaction with TGFβ receptor type II (45).

CEA has long been established as a marker for CRC (46). It is a member of the immunoglobulin superfamily and has been associated with cancer dissemination (47). Because of its low sensitivity and specificity, CEA has limited use in screening or diagnosis of early-stage CRC and has primarily been assayed to determine preoperative prognosis and for disease monitoring (6, 48). Plasma levels of CEA are reduced following surgical removal of cancerous polyps (49–51). Our data suggest that CEA may have use for early detection of CRC as part of a panel of markers.

IGFBP2 has been previously investigated as a potential plasma marker for CRC and other cancers (52–54) with mostly negative findings (55, 56). In this study, plasma levels were significantly elevated in newly diagnosed patients, but not in preclinical samples, suggesting that circulating levels of IGFBP2 increase with progressive tumor development (57). Given the occurrence of IGFBP2 in the conditioned media of CRC cell lines, it is likely that an increase in tumor cell mass contributes to the observed increase in plasma levels with advanced CRC.

ENO1, PDIA3, and ARMET have all previously been investigated in various cancers but only ENO1 has been associated with colorectal tumor development (26–28). Levels of PDIA3 in human plasma were found to be elevated in hepatocellular carcinoma on the basis of an immunoassay (27) and in gastric cancer on the basis of proteomic mass spectrometric analysis (28). Our findings suggest that these 3 proteins may be elevated in the plasma of patients with CRC before the clinical diagnosis of the disease. These markers may further improve the performance of the 4-marker panel presented in this study.

Multiple steps were used to facilitate the in-depth, quantitative plasma proteomic profiling in this study: depletion of the 6 most highly abundant plasma proteins, extensive protein fractionation using reverse-phase and anion exchange chromatography, and the use of heavy and light acrylamide labels for comparison of cases and controls. Using these steps, proteins across 7 orders of magnitude and with concentrations in the picogram per milliliter range have been identified. MS-based discovery of this nature, while quantitative, does not recognize posttranslational modifications, such as glycosylation, that may be cancer-related (58).

Most prior discovery studies of blood-based biomarkers for early detection have been based on analysis of specimens collected at the time of diagnosis. In contrast, our study relied on plasma samples collected before the clinical manifestation of CRC to identify and validate a panel of markers that could be useful for early detection and identification of subjects at increased risk of developing CRC. The WHI cohort samples used in discovery and validation sets consist entirely of postmenopausal women and may not be representative of the general population. Hormone therapy use, which may alter the circulating levels of some proteins, was not a factor used in matching case and control samples in this study and could have impacted the plasma proteome. However, there was no bias in this regard between cases and controls of which we are aware. Moreover, validation data from newly diagnosed patients suggest that levels of the assayed markers are not confounded by gender (Supplementary Fig. S6) or hormone therapy.

The WHI cohort meets the requirements of phase III of biomarker development as outlined by Pepe and colleagues (10). Because discovery of markers was also done in preclinical samples, phases I and II were not applicable, which is an advantage of this study. The primary aims of phase III, to evaluate the capacity of the biomarkers to detect preclinical disease and to define criteria for a positive screening test, were addressed. CEA, MAPRE1, and LRG1 were shown to significantly differentiate preclinical cases from matched controls, whereas IGFBP2 was elevated in preclinical cases, but not significantly so. Furthermore, a linear combination of these 4 markers was established that differentiates preclinical cases from controls with 41% sensitivity at 95% specificity.

Mass spectrometric analysis of preclinical CRC compared with matched controls yielded a set of elevated proteins that were further validated by ELISA. Three of these proteins in conjunction with CEA show promise as a preclinical test for CRC. Further improvements in sensitivity and specificity based on inclusion of additional markers may ultimately lead to a blood-based test to aid in screening for CRC.

No potential conflicts of interest were disclosed.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1.
Hawk
ET
,
Levin
B
. 
Colorectal cancer prevention
.
J Clin Oncol
2005
;
23
:
378
91
.
2.
Singh
H
,
Turner
D
,
Xue
L
,
Targownik
LE
,
Bernstein
CN
. 
Risk of developing colorectal cancer following a negative colonoscopy examination - Evidence for a 10-year interval between colonoscopies
.
JAMA
2006
;
295
:
2366
73
.
3.
Seeff
LC
,
Manninen
DL
,
Dong
FB
,
Chattopadhyay
SK
,
Nadel
MR
,
Tangka
FKL
, et al
Is there endoscopic capacity to provide colorectal cancer screening to the unscreened population in the United States?
Gastroenterology
2004
;
127
:
1661
9
.
4. Wee CC, McCarthy EP, Phillips RS. Factors associated with colon cancer screening: the role of patient factors and physician counseling. Prev Med 2005;41:23–9.
5. Denberg TD, Melhado TV, Coombes JM, Beaty BL, Berman K, Byers TE, et al. Predictors of nonadherence to screening colonoscopy. J Gen Intern Med 2005;20:989–95.
6. Duffy MJ. Carcinoembryonic antigen as a marker for colorectal cancer: is it clinically useful? Clin Chem 2001;47:624–30.
7. Duffy MJ, van Dalen A, Haglund C, Hansson L, Holinski-Feder E, Klapdor R, et al. Tumour markers in colorectal cancer: European Group on Tumour Markers (EGTM) guidelines for clinical use. Eur J Cancer 2007;43:1348–60.
8. Ludwig JA, Weinstein JN. Biomarkers in cancer staging, prognosis and treatment selection. Nat Rev Cancer 2005;5:845–56.
9. Faca V, Pitteri SJ, Newcomb L, Glukhova V, Phanstiel D, Krasnoselsky A, et al. Contribution of protein fractionation to depth of analysis of the serum and plasma proteomes. J Proteome Res 2007;6:3558–65.
10. Pepe MS, Etzioni R, Feng ZD, Potter JD, Thompson ML, Thornquist M, et al. Phases of biomarker development for early detection of cancer. J Natl Cancer Inst 2001;93:1054–61.
11. Pepe MS, Feng ZD, Janes H, Bossuyt PM, Potter JD. Pivotal evaluation of the accuracy of a biomarker used for classification or prediction: standards for study design. J Natl Cancer Inst 2008;100:1432–8.
12. Faca V, Coram M, Phanstiel D, Glukhova V, Zhang Q, Fitzgibbon M, et al. Quantitative analysis of acrylamide labeled serum proteins by LC-MS/MS. J Proteome Res 2006;5:2009–18.
13. Misek DE, Kuick R, Wang H, Galchev V, Deng B, Zhao R, et al. A wide range of protein isoforms in serum and plasma uncovered by a quantitative intact protein analysis system. Proteomics 2005;5:3343–52.
14. Wang H, Clouthier SG, Galchev V, Misek DE, Duffner U, Min CK, et al. Intact-protein-based high-resolution three-dimensional quantitative analysis system for proteome profiling of biological fluids. Mol Cell Proteomics 2005;4:618–25.
15. Wang H, Hanash S. Intact-protein analysis system for discovery of serum-based disease biomarkers. In: Simpson RJ, Greening DW, editors. Serum/plasma proteomics: methods and protocols. New York: Humana Press; 2011. p. 69–85.
16. Rauch A, Bellew M, Eng J, Fitzgibbon M, Holzman T, Hussey P, et al. Computational Proteomics Analysis System (CPAS): an extensible, open-source analytic system for evaluating and publishing proteomic data and high throughput biological experiments. J Proteome Res 2006;5:112–21.
17. MacLean B, Eng JK, Beavis RC, McIntosh M. General framework for developing and evaluating database scoring algorithms using the TANDEM search engine. Bioinformatics 2006;22:2830–2.
18. Keller A, Nesvizhskii AI, Kolker E, Aebersold R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal Chem 2002;74:5383–92.
19. Nesvizhskii AI, Keller A, Kolker E, Aebersold R. A statistical model for identifying proteins by tandem mass spectrometry. Anal Chem 2003;75:4646–58.
20. Faca VM, Ventura AP, Fitzgibbon MP, Pereira-Faca SR, Pitteri SJ, Green AE, et al. Proteomic analysis of ovarian cancer cells reveals dynamic processes of protein secretion and shedding of extra-cellular domains. PLoS One 2008;3:e2425.
21. Kersey PJ, Duarte J, Williams A, Karavidopoulou Y, Birney E, Apweiler R. The International Protein Index: an integrated database for proteomics experiments. Proteomics 2004;4:1985–8.
22. Green RA, Wollman R, Kaplan KB. APC and EB1 function together in mitosis to regulate spindle dynamics and chromosome alignment. Mol Biol Cell 2005;16:4609–22.
23. Shirai R, Hirano F, Ohkura N, Ikeda K, Inoue S. Up-regulation of the expression of leucine-rich alpha(2)-glycoprotein in hepatocytes by the mediators of acute-phase response. Biochem Biophys Res Commun 2009;382:776–9.
24. O'Donnell LC, Druhan LJ, Avalos BR. Molecular characterization and expression analysis of leucine-rich alpha 2-glycoprotein, a novel marker of granulocytic differentiation. J Leukoc Biol 2002;72:478–85.
25. Wolf E, Lahm H, Wu MY, Wanke R, Hoeflich A. Effects of IGFBP-2 overexpression in vitro and in vivo. Pediatr Nephrol 2000;14:572–8.
26. Jiang W, Li X, Rao SQ, Wang LH, Du L, Li CX, et al. Constructing disease-specific gene networks using pair-wise relevance metric: application to colon cancer identifies interleukin 8, desmin and enolase 1 as the central elements. BMC Syst Biol 2008;2:72.
27. Chignard N, Shang SF, Wang H, Marrero J, Brechot C, Hanash S, et al. Cleavage of endoplasmic reticulum proteins in hepatocellular carcinoma: detection of generated fragments in patient sera. Gastroenterology 2006;130:2010–22.
28. Ren H, Du N, Liu G, Hu HT, Tian W, Deng ZP, et al. Analysis of variabilities of serum proteomic spectra in patients with gastric cancer before and after operation. World J Gastroenterol 2006;12:2789–92.
29. Hung KE, Faca V, Song K, Sarracino DA, Richard LG, Krastins B, et al. Comprehensive proteome analysis of an Apc mouse model uncovers proteins associated with intestinal tumorigenesis. Cancer Prev Res (Phila) 2009;2:224–33.
30. Fodde R, Smits R, Clevers H. APC, signal transduction and genetic instability in colorectal cancer. Nat Rev Cancer 2001;1:55–67.
31. Brocardo M, Henderson BR. APC shuttling to the membrane, nucleus and beyond. Trends Cell Biol 2008;18:587–96.
32. Wen Y, Eng CH, Schmoranzer J, Cabrera-Poch N, Morris EJS, Chen M, et al. EB1 and APC bind to mDia to stabilize microtubules downstream of Rho and promote cell migration. Nat Cell Biol 2004;6:820–30.
33. Liu M, Yang SB, Wang YH, Zhu HX, Yan S, Zhang W, et al. EB1 acts as an oncogene via activating beta-catenin/TCF pathway to promote cellular growth and inhibit apoptosis. Mol Carcinog 2009;48:212–9.
34. Wang YH, Zhou XB, Zhu HX, Liu S, Zhou CQ, Zhang G, et al. Overexpression of EB1 in human esophageal squamous cell carcinoma (ESCC) may promote cellular growth by activating beta-catenin/TCF pathway. Oncogene 2005;24:6637–45.
35. Ralhan R, Desouza LV, Matta A, Tripathi SC, Ghanny S, Gupta SD, et al. Discovery and verification of head-and-neck cancer biomarkers by differential protein expression analysis using iTRAQ labeling, multidimensional liquid chromatography, and tandem mass spectrometry. Mol Cell Proteomics 2008;7:1162–73.
36. Orimo T, Ojima H, Hiraoka N, Saito S, Kosuge T, Kakisaka T, et al. Proteomic profiling reveals the prognostic value of adenomatous polyposis coli-end-binding protein 1 in hepatocellular carcinoma. Hepatology 2008;48:1851–63.
37. Wu C, Orozco C, Boyer J, Leglise M, Goodale J, Batalov S, et al. BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome Biol 2009;10:R130.
38. Berglund L, Björling E, Oksvold P, Fagerberg L, Asplund A, Szigyarto C, et al. A genecentric Human Protein Atlas for expression profiles based on antibodies. Mol Cell Proteomics 2008;7:2019–27.
39. Heo SH, Lee SJ, Ryoo HM, Park JY, Cho JY. Identification of putative serum glycoprotein biomarkers for human lung adenocarcinoma by multilectin affinity chromatography and LC-MS/MS. Proteomics 2007;7:4292–302.
40. Kakisaka T, Kondo T, Okano T, Fujii K, Honda K, Endo M, et al. Plasma proteomics of pancreatic cancer patients by multi-dimensional liquid chromatography and two-dimensional difference gel electrophoresis (2D-DIGE): up-regulation of leucine-rich alpha-2-glycoprotein in pancreatic cancer. J Chromatogr B Analyt Technol Biomed Life Sci 2007;852:257–67.
41. Okano T, Kondo T, Kakisaka T, Fujii K, Yamada M, Kato H, et al. Plasma proteomics of lung cancer by a linkage of multi-dimensional liquid chromatography and two-dimensional difference gel electrophoresis. Proteomics 2006;6:3938–48.
42. Serada S, Fujimoto M, Ogata A, Terabe F, Hirano T, Iijima H, et al. iTRAQ-based proteomic identification of leucine-rich alpha-2 glycoprotein as a novel inflammatory biomarker in autoimmune diseases. Ann Rheum Dis 2010;69:770–4.
43. Hofman PM. Pathobiology of the neutrophil-intestinal epithelial cell interaction: role in carcinogenesis. World J Gastroenterol 2010;16:5790–800.
44. Sun DT, Kar S, Carr BI. Differentially expressed genes in TGF-beta 1 sensitive and resistant human hepatoma-cells. Cancer Lett 1995;89:73–9.
45. Munoz NM, Upton M, Rojas A, Washington MK, Lin L, Chytil A, et al. Transforming growth factor beta receptor type II inactivation induces the malignant transformation of intestinal neoplasms initiated by Apc mutation. Cancer Res 2006;66:9837–44.
46. Ballesta AM, Molina R, Filella X, Jo J, Gimenez N. Carcinoembryonic antigen in staging and follow-up of patients with solid tumors. Tumor Biol 1995;16:32–41.
47. Hostetter RB, Augustus LB, Mankarious R, Chi KF, Fan D, Toth C, et al. Carcinoembryonic antigen as a selective enhancer of colorectal-cancer metastasis. J Natl Cancer Inst 1990;82:380–5.
48. Fletcher RH. Carcinoembryonic antigen. Ann Internal Med 1986;104:66–73.
49. McCall JL, Black RB, Rich CA, Harvey JR, Baker RA, Watts JM, et al. The value of serum carcinoembryonic antigen in predicting recurrent disease following curative resection of colorectal-cancer. Dis Colon Rectum 1994;37:875–81.
50. Wang JY, Tang RP, Chiang JM. Value of carcinoembryonic antigen in the management of colorectal-cancer. Dis Colon Rectum 1994;37:272–7.
51. Bannura G, Cumsille MA, Contreras J, Barrera A, Melo C, Soto D. [Carcinoembryonic antigen (CEA) as an independent prognostic factor in colorectal carcinoma]. Rev Med Chile 2004;132:691–700.
52. Cohen P, Peehl DM, Stamey TA, Wilson KF, Clemmons DR, Rosenfeld RG. Elevated levels of insulin-like growth factor-binding protein-2 in the serum of prostate-cancer patients. J Clin Endocrinol Metab 1993;76:1031–5.
53. Flyvbjerg A, Mogensen O, Mogensen B, Nielsen OS. Elevated serum insulin-like growth factor-binding protein 2 (IGFBP-2) and decreased IGFBP-3 in epithelial ovarian cancer: correlation with cancer antigen 125 and tumor-associated trypsin inhibitor. J Clin Endocrinol Metab 1997;82:2308–13.
54. Liou JM, Shun CT, Liang JT, Chiu HM, Chen MJ, Chen CC, et al. Plasma insulin-like growth factor-binding protein-2 levels as diagnostic and prognostic biomarker of colorectal cancer. J Clin Endocrinol Metab 2010;95:1717–25.
55. Jenab M, Riboli E, Cleveland RJ, Norat T, Rinaldi S, Nieters A, et al. Serum C-peptide, IGFBP-1 and IGFBP-2 and risk of colon and rectal cancers in the European Prospective Investigation into Cancer and Nutrition. Int J Cancer 2007;121:368–76.
56. Kaaks R, Lundin E, Manjer J, Rinaldi S, Biessy C, Soderberg S, et al. Prospective study of IGF-I, IGF-binding proteins, and breast cancer risk, in Northern and Southern Sweden. Cancer Causes Control 2002;13:307–16.
57. Diehl D, Hessel E, Oesterle D, Renner-Muller I, Elmlinger M, Langhammer M, et al. IGFBP-2 overexpression reduces the appearance of dysplastic aberrant crypt foci and inhibits growth of adenomas in chemically induced colorectal carcinogenesis. Int J Cancer 2009;124:2220–5.
58. Zhang Q, Faca V, Hanash S. Mining the plasma proteome for disease applications across seven logs of protein abundance. J Proteome Res 2011;10:46–50.