Abstract
The development of noninvasive methods for early detection of colon cancer is critical for the successful management of this disease. Using a targeted quantitative proteomics technique, we assessed the ability of 12 serum proteins to detect the presence of colonic polyps in the ApcPirc/+ rat model of familial colon cancer. Serum protein candidates were selected from gene transcripts upregulated in colonic tumors of ApcPirc/+ rats and from a prior study of serum proteins differentially expressed in mice carrying intestinal adenomas. Proteins were quantified at early stages of polyp formation in a rat cohort monitored longitudinally by colonoscopy over a period of 75 days. Of the 12 proteins monitored at three distinct time points, seven showed differential expression in at least one time point in the serum from ApcPirc/+ rats compared with wild-type rats. Tumor multiplicity correlated with protein expression changes, and most tumors grew during the study. EGFR, LRG1, ITIH4, and F5 displayed the most robust tumor-associated protein expression changes over time. Receiver operator characteristic analysis using these four proteins resulted in a sensitivity of 100%, a specificity of 80%, and an area under the curve of 0.93 at 135 days of age, when the Pirc rats bore an average of 19 tumors in the colon and seven in the small intestine. The results of this study demonstrate that the quantitative analysis of a panel of serum proteins can detect the presence of early intestinal tumors in a rat model, and they provide support for future measurements in humans. Cancer Prev Res; 7(11); 1160–9. ©2014 AACR.
Introduction
Colorectal cancer is a major cause of cancer-related morbidity and mortality in modernized nations, and is increasing in frequency in the developing world (1). Although early detection of localized colorectal cancer often leads to complete cure by polypectomy or surgery, the modalities for early detection are currently limited in sensitivity and specificity, have low patient adherence to screening recommendations, and place a strain on the capacity of clinical gastroenterologists (2, 3). The current recommended screening procedures (colonoscopy, CT colonography, or fecal occult blood test) can be nonspecific, insensitive for the earliest operable lesions, or highly invasive (4, 5). In contrast, a detection modality based upon blood samples could achieve broader patient compliance and clinical coverage. This study begins to address whether analysis of the serum proteome can meet the need for improved early detection methods that overcome these issues.
With proper caveats, the use of animal models in a controlled environment can guide the understanding and treatment of human disease. In previous studies, we have used the ApcMin/+ mouse model of familial intestinal cancer to identify proteins that are differentially expressed in tumor-bearing versus tumor-free colonic tissue and in the serum of ApcMin/+ versus Apc+/+ mice (6, 7). However, ApcMin/+ mice develop adenomas predominantly in the small intestine, not the colon, which confounds the interpretation of using the ApcMin/+ model for colon cancer studies (7–9). In contrast, ApcPirc/+ rats develop adenomas and localized adenocarcinomas preferentially in the colon, as do humans with familial inherited and sporadic forms of the disease (10). The localization of tumors predominantly in the colon has the added advantage of using colonoscopy to annotate the growth patterns of individual colonic tumors over time. For these reasons, we have explored the use of ApcPirc/+ rats for identifying serum proteins that may be useful as biomarkers for the presence of colonic tumors.
A high-throughput, quantitative selected reaction monitoring (SRM) mass spectrometry (MS) assay was used to validate proteins differentially expressed in ApcPirc/+ rat serum. Candidate proteins were selected from two discovery modes. First, transcriptome analysis identified transcripts whose proteins are secreted and upregulated in ApcPirc/+ tumor tissue compared with matched normal mucosa. Second, proteins found to be differentially expressed in our prior study of serum from ApcMin/+ mice were selected to determine whether they could also detect polyps in the more colon-specific ApcPirc/+ rat model (7). The sensitivity and specificity to detect the presence of colon polyps of each protein, both individually and as part of a panel, were determined by receiver operator characteristic (ROC) analysis. This study showed that the levels of EGFR, LRG1, ITIH4, and F5 have significant diagnostic potential in ApcPirc/+ rats, both as individual markers and collectively as a panel.
Materials and Methods
Animal breeding and maintenance
Rats were maintained under a protocol approved by the Animal Care and Use Committee of the University of Wisconsin School of Medicine and Public Health and in a facility in the McArdle Laboratory approved by the American Association of Laboratory Animal Care. Rats were individually housed in standard caging with free access to Lab Diet 5020 chow and acidified water. Only male rats were used for the microarray and proteomics studies to eliminate potential confounding by estrus cycling in female rats. A 12:12 hour light:dark cycle was maintained throughout the experiments, and rats were all dissected within a 4-hour window to control for any variation due to circadian cycles.
F1 generation (ACIxF344)–ApcPirc/+ rats were generated by breeding female ACI Apc+/+ rats (Harlan) to male F344N/Tac coisogenic ApcPirc/+ (Pirc) rats (developed in the laboratory of W.F. Dove and available through Taconic; ref. 11). These “F1-Pirc” rats show an increased tumor multiplicity and decreased time to tumor emergence compared with the standard coisogenic F344N/Tac-Pirc rat. One group of 97-day-old F1-Pirc rats was used for the microarray study; a separate group was used for real-time PCR confirmation of candidate transcript levels. An additional two groups, an F1-Pirc and an (ACI X F344)F1Apc+/+ “F1-wildtype” cohort, were followed longitudinally from 60 to 135 days of age for the proteomics study.
The microarray rat cohort
The microarray experiments follow the nomenclature, descriptions, and data sharing recommended by the MIAME Guidelines (12). Data have been deposited in NCBI's Gene Expression Omnibus (13) and are accessible through GEO Series accession number GSE54035. To measure the levels of transcripts that were differentially expressed in tumors, RNA was isolated from 10 colonic tumor samples and four matched normal tissue samples from four F1-Pirc rats. Tumor samples were obtained by harvesting one-quarter of the tumor. For the collection of normal intestinal tissue, a scalpel blade was used to gently scrape the luminal surface of the distal colon no closer than 8 mm away from any tumor. Each normal tissue or tumor sample was homogenized in a tube containing RLTplus buffer (Qiagen) and frozen at −80°C until use. RNA was extracted from the sample using an Allprep DNA/RNA Mini Kit (Qiagen), following the manufacturer's protocol. Total RNA (100 ng) was labeled with a Low Input Quick Amp kit with Cy3 dye (Agilent Technologies) according to the manufacturer's instructions. RNA collected from normal tissue was labeled with Cy5 dye. Samples were evenly distributed and hybridized to Agilent 4 × 44 k whole genome microarrays. Following incubation, arrays were scanned on an Agilent High-Resolution Microarray Scanner at 3-μm resolution with a 20-bit data format. Files were extracted using Agilent Feature Extraction version 10.7. Data were then imported into Genome Suite software for analysis (Partek). A list of genes differentially expressed between normal colonic tissue and tumor was generated using the criterion of differential expression equal to or greater than 5-fold with a false discovery rate (FDR) equal to or less than 5%.
Transcriptome candidates were verified by real-time PCR, following the nomenclature and description recommended by the MIQE Guidelines (14). Hydrolysis probes labeled with FAM dye for Cd44 (exons 16–17) and Mmp7 (exons 4–5) were purchased from Applied Biosystems, and probes for Cfi (exons 10–12), Lrg1 (exons 1–2), and Mmp10 (exons 8–10) were purchased from Integrated DNA Technologies. Gapdh labeled with VIC dye (Applied Biosystems) was used as a reference gene. Matched tumor and normal colon samples were analyzed from 4 individual rats. Each sample was run in triplicate and technical error between replicates did not exceed 7%. Fold-change expression of each gene was determined by calculating 2^n for each sample, where n equals the amplification cycle difference between Gapdh and the test probe.
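The 2^n calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names and cycle values are hypothetical, the calculation assumes perfect doubling per amplification cycle, and taking the tumor/normal ratio of the two 2^n values is an assumption about how the per-sample values were combined into a fold change.

```python
def relative_expression(ct_gapdh: float, ct_target: float) -> float:
    """Return 2**n, where n is the cycle difference between Gapdh and the test probe."""
    n = ct_gapdh - ct_target
    return 2.0 ** n


def tumor_vs_normal_fold_change(tumor: tuple, normal: tuple) -> float:
    """Ratio of Gapdh-normalized expression in tumor to that in matched normal tissue.

    Each argument is a (ct_gapdh, ct_target) pair. Combining the two samples
    as a ratio is an assumption made for this sketch.
    """
    return relative_expression(*tumor) / relative_expression(*normal)


# Hypothetical cycle values for one matched pair (not data from the study).
tumor_sample = (18.0, 22.0)   # (Ct Gapdh, Ct target)
normal_sample = (18.0, 25.0)

fold = tumor_vs_normal_fold_change(tumor_sample, normal_sample)
print(f"fold change (tumor/normal): {fold:g}")
```

In this made-up pair the target amplifies three cycles earlier in tumor than in normal tissue at equal Gapdh levels, which corresponds to an 8-fold increase.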
The longitudinal rat cohorts
Blood samples were collected, processed, and stored using standard operating procedures published by the Early Detection Research Network within a 2-hour time window (15). At 60, 90, and 135 days of age, approximately 1.5 mL of blood was collected from the retro-orbital sinus into Protein LoBind tubes (Eppendorf) from 14 F1-Pirc and 10 F1-wildtype rats anesthetized with 3% isoflurane. Blood was left to clot at room temperature for 30 to 60 minutes before centrifugation at room temperature for 20 minutes at 1,200 × g (Eppendorf 5415c). The serum was then transferred to new Protein LoBind tubes using sterile LoRetention Dualfilter pipet tips (Eppendorf) and frozen at −80°C until use.
Following blood collection, each animal underwent endoscopy to enumerate the visible tumors and to determine the growth pattern of each individual tumor. Rats were anesthetized with 3% isoflurane and placed on a sterile surgical field. The colon was flushed with warm saline to remove any fecal material and to provide lubrication. Video and still images of each colon tumor were captured at each visit and were visually compared by three blinded observers after both visits. Each tumor was given one of three scores: growing, static, or regressing. A consensus score was generated for each tumor based on agreement between at least two of the three observers. Rats were sacrificed at 135 days to determine total intestinal tumor multiplicity. Formalin-fixed tumors in the small intestine and colon were counted at ×10 magnification on an Olympus dissecting microscope.
Protein candidate selection
Serum proteins for SRM-MS analysis were chosen using two strategies. First, protein candidates were chosen corresponding to transcripts upregulated in colon tumors in the microarray study. These candidates were nominated using three criteria: those with RNA levels upregulated at least 5-fold in colonic neoplasms compared with normal tissue after filtering to a 0.05 FDR; proteins predicted or known to be secreted (16); and proteins with potential biological significance to colon cancer (17). The second strategy of candidate selection used quantitative proteomic data from the serum of the ApcMin/+ mouse compared with wild-type, as previously described (7). An isotopically labeled peptide reference standard unique to each selected biomarker candidate was synthesized by the UW-Madison Biotechnology Center's peptide synthesis core facility, with the incorporation of one 13C15N–labeled amino acid in each reference peptide.
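The three nomination criteria for transcriptome candidates can be expressed as a simple filter. The sketch below is illustrative only: the `Transcript` record, its field names, and the example entries are hypothetical placeholders, and the secretion and relevance flags stand in for the database lookups cited in the text.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    gene: str
    fold_change: float      # tumor vs. normal
    fdr: float              # false discovery rate
    secreted: bool          # predicted or known to be secreted
    cancer_relevant: bool   # potential biological significance to colon cancer


def nominate(transcripts, min_fold=5.0, max_fdr=0.05):
    """Return gene names that satisfy all three nomination criteria."""
    return [
        t.gene
        for t in transcripts
        if t.fold_change >= min_fold
        and t.fdr <= max_fdr
        and t.secreted
        and t.cancer_relevant
    ]


# Placeholder entries, not genes or values from the study.
examples = [
    Transcript("GeneA", 12.0, 0.01, True, True),   # passes every criterion
    Transcript("GeneB", 3.0, 0.01, True, True),    # fold change too low
    Transcript("GeneC", 8.0, 0.20, True, True),    # FDR too high
    Transcript("GeneD", 9.0, 0.02, False, True),   # not secreted
]

nominated = nominate(examples)
print(nominated)
```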
Sample preparation for quantitative proteomic analysis
Serum was washed five times with 10 kDa MWCO Amicon Centriprep units with 5 mL of 20% acetonitrile/80% Milli-Q H2O at 1,500 × g for 1 hour at 4°C followed by lyophilization. Albumin, transferrin, and IgG were removed from a 2-mg aliquot of resolubilized serum, using a 4.6- × 100-mm mouse MARS column (Agilent Technologies) according to the manufacturer's protocol. Proteins not retained by the column were collected, concentrated, and precipitated with trichloroacetic acid as previously described (7). A Pierce BCA protein concentration assay was performed on resolubilized samples according to the manufacturer's instructions (Thermo Fisher Scientific).
A 100-μg aliquot of serum protein from each sample underwent reduction and alkylation of cysteine residues, followed by digestion at 37°C overnight using sequencing grade porcine trypsin (Promega) at a 1:50 trypsin–protein ratio. Before reduction and alkylation, the stable isotope–labeled peptide reference standard of each target endogenous peptide was added to the serum protein sample. The resultant peptides were desalted on SPEC C18 Pipette Tips (Agilent Technologies) according to the manufacturer's instructions. Eluted peptides were dried using a vacuum centrifuge.
Liquid chromatography and mass spectrometry
Liquid chromatography separation of a 2-μg sample was achieved by reversed-phase chromatography using a NanoLC Ultra 2D HPLC (Eksigent) equipped with a nanoflex cHiPLC set to 37°C. A 90-minute gradient was used for peptide separation as described previously (7), followed by elution directly into a 5500 QTrap (AbSciex). Peptide precursors were selected in quadrupole 1 (Q1), fragmented in q2, and the top 3 to 4 transitions were selected for monitoring in Q3. All Q1 and Q3 masses were measured at unit resolution. A 7-minute scheduling window was applied with a 1.5-second cycle time. Method development and peak analysis were done using Skyline software (18).
Mass spectrometry data processing and analysis
Mass spectrometry results were imported into Skyline and peaks integrated. Each peptide was evaluated using the average peak area of the most intense transition over three technical replicates. For each protein, an average ratio of F1-Pirc/F1–wild-type was calculated for each of the peptides. P-values were obtained using a two-tailed Student's t test assuming a normal distribution.
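The ratio and significance calculation described above can be sketched with the Python standard library. This is a hedged illustration rather than the authors' pipeline: the function names and peak-area values are hypothetical, and the sketch stops at the pooled-variance t statistic and its degrees of freedom. A two-tailed P value would then be read from a t distribution with those degrees of freedom, for example with `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance


def expression_ratio(pirc, wild_type):
    """Ratio of mean peak area in F1-Pirc to mean peak area in F1-wild-type."""
    return mean(pirc) / mean(wild_type)


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / dof
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, dof


# Hypothetical peak areas (arbitrary units), not data from the study.
pirc_areas = [2.0, 4.0, 6.0]
wt_areas = [1.0, 2.0, 3.0]

ratio = expression_ratio(pirc_areas, wt_areas)
t_stat, dof = pooled_t_statistic(pirc_areas, wt_areas)
print(f"ratio={ratio:.2f}, t={t_stat:.3f}, dof={dof}")
```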
The diagnostic capability of serum protein markers, individually and as a panel, was determined by ROC analysis with the JROCFIT web-based calculator (19), applied to the same test set of 14 F1-Pirc and 10 F1–wild-type animals. Data format 2 (binary response with confidence rating) was used with a total of three rating categories: (i) low confidence; (ii) intermediate confidence; and (iii) high confidence. First, each protein was rated for its diagnostic capacity as an individual protein. Next, a group of four specific proteins, chosen on the basis of their individual ROC analyses, was evaluated for its diagnostic potential as a panel. Additional details can be found in the Supplementary Materials and Methods.
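To make the rating-based ROC idea concrete, the sketch below computes an empirical AUC from three-level confidence ratings. It is an assumption-laden illustration: the function name and ratings are hypothetical, and the empirical (rank-based) AUC used here is a swapped-in simplification that will generally differ from the fitted curve a dedicated ROC-fitting tool reports.

```python
def empirical_auc(positive_ratings, negative_ratings):
    """Probability that a tumor-bearing animal is rated higher than a tumor-free one.

    Ties count as one half. Ratings use 1 = low, 2 = intermediate,
    3 = high confidence that the animal is tumor-bearing.
    """
    score = 0.0
    for p in positive_ratings:
        for n in negative_ratings:
            if p > n:
                score += 1.0
            elif p == n:
                score += 0.5
    return score / (len(positive_ratings) * len(negative_ratings))


# Hypothetical ratings, not data from the study.
pirc_ratings = [3, 3, 2, 1]
wild_type_ratings = [1, 1, 2]

auc = empirical_auc(pirc_ratings, wild_type_ratings)
print(f"empirical AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to ratings that do not separate the two groups, and 1.0 to perfect separation.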
Results
Transcriptome and proteome discovery studies identified protein biomarker candidates for validation in F1-Pirc rats
A total of 928 microarray probes were differentially expressed by at least 5-fold between normal colonic tissue and tumors from F1-Pirc rats. The decision to compare normal and tumor tissue from the same F1-Pirc rat was based on our finding that normal colonic tissue from F1-Pirc and wild-type rats showed only six differentially expressed genes between the two sources (data not shown). Thus, normal F1-Pirc intestinal mucosa sufficiently represents gene expression in wild-type tumor-free intestinal mucosa. Pathology analysis of the tumors determined that they were adenomas (75%) or intramucosal carcinomas (25%), which correspond histologically to the earliest, operable stages of the human disease (10). In total, 543 probes were more highly expressed in tumor tissue, whereas the remaining 415 probes were more highly expressed in normal tissue.
For this study, we considered only those probes upregulated in tumor. The list of probes was narrowed to 5 transcriptome candidates based on the observable presence of their protein products in serum by mass spectrometry. The 5 upregulated transcriptome candidates selected for proteomic analysis were verified using RT-PCR (Fig. 1). Originally, we tested 12 transcriptome candidates by mass spectrometry but seven of the predicted secreted protein products were not visible in the SRM-MS assay (Supplementary Fig. S1). The final list of 12 proteins selected for proteomic validation included 3 candidates from the F1-Pirc rat tumor transcriptome analysis, and 7 candidates from the ApcMin/+ mouse serum proteomic discovery study, with complement factor I (CFI) and leucine-rich alpha-2 glycoprotein 1 (LRG1) shared between the two discovery strategies (Supplementary Table S1).
Figure 1. Gene transcripts upregulated in tumor compared with normal tissue using Agilent Whole Genome Microarray discovery and RT-PCR validation. Microarray candidates (bar graph) represent genes which: (i) show a 5-fold or greater upregulation in mRNA expression levels in tumors, (ii) code for known or predicted secreted proteins, and (iii) have some known biological significance to human colon cancer. RT-PCR data (numbers above bars) represent average fold change between F1-Pirc tumor/F1-Pirc normal epithelium in 4 animals, confirming the microarray analysis. For Mmp10, in 2 of the 4 animals, we were unable to detect the Mmp10 transcript. Therefore, Mmp10 might have variable expression in colonic tumors, but further investigation is needed to clarify.
Protein expression over time revealed differential expression concordant with increases in tumor multiplicity
Quantitative proteomics revealed that MMP7, LRG1, ITIH4, VTN, HPX, and F5 proteins show increased levels in blood serum over time (Fig. 2A, Table 1, Supplementary Table S2). These data correspond with the transcriptome discovery data (Fig. 1) and our ApcMin/+ mouse proteomics discovery and validation data (7). Average EGFR expression in F1-Pirc rats was significantly downregulated at 135 days, as observed in our prior proteomics discovery study (7). Although it was expected that ITIH3, CFI, MMP10, and CD44 would show upregulation and that COL1A1 would be downregulated, no statistically significant expression changes of these candidates were observed in the serum proteome of tumor-bearing F1-Pirc rats. There are several plausible explanations for this lack of observable change in these candidates, including: that the contribution of secreted protein from the tumor is overwhelmed by expression from other sources in the rat; that gene expression does not correlate with serum protein expression; that protein expression of a candidate determined by untargeted mass spectrometry (7) was a false discovery; that the 135-day time point is too early to observe changes in the blood for these proteins; or that the biology of the F1-Pirc rat is different from that of the ApcMin/+ mouse. Despite these confounding factors, seven proteins showed significant changes in serum levels in tumor-bearing Pirc rats that matched trends observed in our discovery studies.
Figure 2. Protein expression in serum displayed over a time course (A) and as a function of large intestinal tumor burden (B). Over time, a trend in either upregulation or downregulation for 7 of the 12 proteins was observed for F1-Pirc rats compared with wild-type. Protein expression levels based on tumor count showed similar trends. The “n” value represents the number of F1-Pirc biological samples that fall into each range of tumor counts. Error bars, biological SE.
Table 1. Summary of protein expression and statistical analysis for the individual protein biomarker candidates
| Protein name | Protein symbol | NCBI number | Time point (days of age) | Average expression ratio (Pirc/WT) | P | Sensitivity | Specificity | AUC | F1–wild-type variance over time |
|---|---|---|---|---|---|---|---|---|---|
| Matrix metalloproteinase-7 | MMP7 | NP_036996 | 60 | 1.12 | 0.46 | ND | ND | ND | 25.7% |
| | | | 90 | 1.38 | 0.04 | 57.1% | 80.0% | 0.664 | |
| | | | 135 | 1.74 | 0.004 | 85.7% | 80.0% | 0.843 | |
| Leucine-rich alpha-2 glycoprotein | LRG1 | NP_001009717 | 60 | 1.07 | 0.06 | 16.7% | 100.0% | 0.674 | 12.9% |
| | | | 90 | 1.21 | 0.03 | 64.3% | 90.0% | 0.857 | |
| | | | 135 | 1.43 | <0.001 | 92.9% | 90.0% | 0.907 | |
| Inter-alpha trypsin inhibitor, heavy chain 4 | ITIH4 | NP_062242 | 60 | 1.11 | 0.06 | 50.0% | 83.3% | 0.701 | 15.0% |
| | | | 90 | 1.14 | 0.03 | 28.6% | 100.0% | 0.649 | |
| | | | 135 | 1.37 | 0.001 | 78.6% | 90.0% | 0.871 | |
| Vitronectin | VTN | NP_062029 | 60 | 1.03 | 0.61 | 8.3% | 91.7% | 0.504 | 16.2% |
| | | | 90 | 1.12 | 0.001 | 35.7% | 100.0% | 0.821 | |
| | | | 135 | 1.20 | 0.02 | 71.4% | 90.0% | 0.854 | |
| Hemopexin | HPX | NP_445770 | 60 | 1.15 | 0.006 | 50.0% | 100.0% | 0.708 | 23.3% |
| | | | 90 | 1.26 | <0.001 | 78.6% | 100.0% | 0.882 | |
| | | | 135 | 1.43 | 0.003 | 85.7% | 80.0% | 0.792 | |
| Epidermal growth factor receptor | EGFR | NP_113695 | 60 | 0.97 | 0.33 | 8.3% | 100.0% | 0.632 | 11.8% |
| | | | 90 | 0.87 | 0.002 | 50.0% | 100.0% | 0.832 | |
| | | | 135 | 0.65 | <0.001 | 100.0% | 80.0% | 0.939 | |
| Coagulation factor V | F5 | NP_001041343 | 60 | 1.00 | 0.94 | 8.3% | 100.0% | 0.545 | 11.5% |
| | | | 90 | 1.08 | 0.08 | 21.4% | 100.0% | 0.679 | |
| | | | 135 | 1.24 | 0.007 | 64.3% | 90.0% | 0.757 | |
| Inter-alpha trypsin inhibitor, heavy chain H3 | ITIH3 | NP_059047 | 60 | 1.03 | 0.57 | 25.0% | 91.7% | 0.615 | 16.3% |
| | | | 90 | 1.07 | 0.02 | 14.3% | 100.0% | 0.679 | |
| | | | 135 | 1.05 | 0.34 | 14.3% | 90.0% | 0.428 | |
| Complement factor I | CFI | NP_077071 | 60 | 1.04 | 0.59 | 16.7% | 91.7% | 0.576 | 23.9% |
| | | | 90 | 1.08 | 0.26 | 21.4% | 90.0% | 0.867 | |
| | | | 135 | 1.13 | 0.24 | 50.0% | 80.0% | 0.820 | |
| Collagen, type I, alpha 1 | COL1A1 | NP_445756 | 60 | 1.11 | 0.09 | 8.3% | 91.7% | 0.309 | 57.6% |
| | | | 90 | 1.10 | 0.11 | 7.1% | 80.0% | 0.371 | |
| | | | 135 | 0.91 | 0.18 | 42.9% | 70.0% | 0.592 | |
| Matrix metalloproteinase 10 | MMP10 | NP_598198 | 60 | 1.02 | 0.81 | 8.3% | 83.3% | 0.462 | 12.0% |
| | | | 90 | 1.03 | 0.30 | 7.1% | 100.0% | 0.561 | |
| | | | 135 | 0.97 | 0.48 | 0.0% | 90.0% | 0.482 | |
| CD44 antigen | CD44 | NP_037056 | 60 | 1.05 | 0.48 | 16.7% | 75.0% | 0.286 | 17.8% |
| | | | 90 | 1.07 | 0.25 | 21.4% | 90.0% | 0.755 | |
| | | | 135 | 0.91 | 0.33 | 7.1% | 80.0% | 0.672 | |
NOTE: MMP7 expression was low in serum and was not identified at the 60-day time point.
Abbreviation: ND, not determined.
At the 60-, 90-, and 135-day time points, F1-Pirc rats averaged 2 ± 2, 7 ± 4, and 19 ± 5 colonic tumors, respectively (Supplementary Table S3). Tumor counts for the small intestine could be obtained only upon dissection at the terminal time point of 135 days, and averaged 13 ± 6 tumors. Of the 26 colonic tumors monitored by colonoscopy, 21 (81%) grew, 4 (15%) became static, and 1 regressed. These data were similar to previous observations of tumor multiplicity and growth in Pirc rats (20). The magnitude of expression change compared with wild-type rats was generally proportional to tumor burden (Fig. 2B). Thus, the seven proteins that were differentially expressed in serum may stem from the growing tumors or from the host response to their presence.
Protein candidates have diagnostic capability of detecting tumors
The statistical significance of the ratio of average protein expression in F1-Pirc rats compared with F1–wild-type rats was determined (Table 1). The average area ratios of MMP7, LRG1, ITIH4, VTN, HPX, EGFR, and F5 each changed significantly (P < 0.05) by 135 days. Except for F5, each of these proteins also shows a significant change by 90 days. However, at 60 days, only HPX showed statistically significant differential expression with a small upregulation of 1.15. These data suggest that the 60-day time point may be too early for the majority of these protein markers to serve individually as a means of detecting the presence of colon polyps from serum. A published histologic review of colon polyps from F1-Pirc rats shows that within the time range studied, the vast majority of tumors are noninvasive adenomas (11), suggesting that the differentially expressed proteins have the potential to identify polyps at the early adenoma stage. Furthermore, the lack of protein expression changes at 60 days gives increased confidence that changes detected at the 90- and 135-day time points are directly or indirectly owing to the presence of the polyps and not to an extratumoral effect of the Apc mutation.
ROC analysis was used to evaluate the potential of each protein to diagnose early colonic neoplasia among the group of 14 F1-Pirc and 10 F1–wild-type rats. Table 1 summarizes the sensitivity, specificity, and area under the curve (AUC) of each protein biomarker at 60, 90, and 135 days (Supplementary Fig. S2). As with the analysis by P-values, AUCs showed greater diagnostic potential at 90 and 135 days than at 60 days, with the sensitivity increasing as tumor burden increased. However, the central goal for early detection is to identify with high confidence any rats with polyps (low false-negative rate). The most predictive proteins were LRG1 and EGFR, which had 1 and 0 false negatives, respectively, at 135 days. These proteins also had very few false positives (1 and 2, respectively), again indicating that their changes in expression in serum are tumor-specific. Among other proteins that show encouraging sensitivity and specificity at the 135-day time point are MMP7, ITIH4, and HPX. The least sensitive blood proteins were MMP10 and CD44 (both originating from the transcriptome discovery study), which are unable to identify the presence of colon polyps.
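The false-negative and false-positive counts quoted above map onto the Table 1 percentages through the usual definitions of sensitivity and specificity. A minimal sketch follows, using the 135-day counts stated in the text for LRG1 and EGFR among 14 F1-Pirc and 10 F1–wild-type rats; the function names are illustrative.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of tumor-bearing animals correctly called positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of tumor-free animals correctly called negative."""
    return true_negatives / (true_negatives + false_positives)


N_PIRC, N_WILD_TYPE = 14, 10

# (false negatives, false positives) at 135 days, as stated in the text.
counts = {"LRG1": (1, 1), "EGFR": (0, 2)}

for protein, (fn, fp) in counts.items():
    sens = sensitivity(N_PIRC - fn, fn)
    spec = specificity(N_WILD_TYPE - fp, fp)
    print(f"{protein}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```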
Protein concentration varied by age in F1–wild-type rats over time
Most proteins vary in concentration under normal biologic conditions (21, 22). For example, the normal adult range for hemopexin in humans is 0.4 to 1.50 g/L (23). To identify concentration changes that are attributable to age, we analyzed F1–wild-type rats over the same range of ages as that studied for the F1-Pirc rats. In this analysis, hemopexin showed a variance of 23.3% across the time points (Table 1), in agreement with the variability reported in humans. The highest observed variance over time was that of COL1A1 at 57.6%, and the lowest was that of F5 at 11.5%. Choosing protein candidates with minimal age-dependent variability may reduce one source of biological variation and assist in identifying concentration changes that are indicative of disease. Thus, F5 may serve as a more robust marker than COL1A1.
A protein panel has high sensitivity and specificity for identifying early-stage colon adenomas
On an individual level, the only protein with perfect sensitivity to detect tumor presence by its concentration in serum was EGFR (135 days). To improve the overall sensitivity for detecting the earliest adenomas, several proteins were analyzed for their predictive ability as a panel. LRG1, ITIH4, EGFR, and F5 were chosen because they showed significant differential expression in F1-Pirc rats and showed the least variance in F1–wild-type protein concentration over time (15% or less). Figure 3 and Table 2 highlight the sensitivity and specificity of this panel to identify rats with colonic polyps. Sensitivity was highest when the threshold for positive diagnosis was set to require only a single protein in the panel to show a positive result. Importantly, at 60 and 90 days the sensitivity increased using the four-protein panel. The panel reduced the number of false negatives from 6 (ITIH4 alone) to 4 at 60 days, and reduced it even further at 90 days from 5 (LRG1 alone) to 2. Maximally, 2 of 10 samples (20%) showed false positives at 60, 90, and 135 days.
Figure 3. ROC analysis of a panel comprising EGFR, LRG1, ITIH4, and F5 for detecting tumors in F1-Pirc rats from serum. This curve represents the requirement for only one of the four proteins to show differential expression for positive diagnosis.
Table 2. Summary of ROC analysis for a panel of four biomarkers (F5, EGFR, LRG1, and ITIH4)
| Minimum number of positive markers to make positive diagnosis | Time point (days of age) | Sensitivity | Specificity | AUC |
|---|---|---|---|---|
| 1 Positive | 60 | 66.7% | 83.3% | 0.764 |
| | 90 | 85.7% | 90.0% | 0.900 |
| | 135 | 100% | 80.0% | 0.932 |
| 2 Positives | 60 | 16.7% | 100% | 0.764 |
| | 90 | 42.9% | 100% | 0.843 |
| | 135 | 85.7% | 80.0% | 0.914 |
| 3 or more Positives | 60 | 0% | 100% | 0.764 |
| | 90 | 21.4% | 100% | 0.911 |
| | 135 | 78.6% | 90.0% | 0.904 |
A more stringent criterion for a positive diagnosis is that two or more proteins must show a positive result. With this criterion, the number of false positives decreased, as expected, and the number of false negatives increased significantly. Because the major goal is to detect the presence of colonic tumors with high sensitivity and no false negatives, it is counterproductive to require simultaneous changes in multiple positive markers. Moreover, we observed that AUC values alone are not sufficient to determine the usefulness of a single protein or a panel for diagnostic purposes. The AUC value assumes that the sensitivity and specificity measurements are equally important (24). Therefore, both sensitivity and specificity values (Table 2) are required to fully assess the markers under consideration.
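The trade-off described above can be illustrated with a minimal Python sketch of a "k-of-n" panel rule: an animal is called positive when at least a chosen number of markers are flagged. The function names and the six-animal data set below are hypothetical and purely illustrative; they are not the study's measurements, and the step that flags an individual marker as positive (a concentration cutoff) is assumed to have been done already.

```python
from typing import List, Sequence, Tuple


def panel_calls(marker_flags: Sequence[Sequence[bool]], min_positive: int) -> List[bool]:
    """Call a sample positive when at least `min_positive` markers are flagged."""
    return [sum(flags) >= min_positive for flags in marker_flags]


def sensitivity_specificity(calls: Sequence[bool], has_tumor: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary calls against known tumor status."""
    tp = sum(c and t for c, t in zip(calls, has_tumor))          # true positives
    fn = sum((not c) and t for c, t in zip(calls, has_tumor))    # false negatives
    tn = sum((not c) and (not t) for c, t in zip(calls, has_tumor))  # true negatives
    fp = sum(c and (not t) for c, t in zip(calls, has_tumor))    # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical per-animal flags for four markers (illustrative order: EGFR, LRG1, ITIH4, F5).
flags = [
    [True, True, False, False],    # tumor-bearing
    [True, False, False, False],   # tumor-bearing
    [False, False, False, False],  # tumor-bearing, missed by every marker
    [True, True, True, False],     # tumor-bearing
    [False, False, False, False],  # tumor-free
    [False, True, False, False],   # tumor-free, one marker falsely flagged
]
has_tumor = [True, True, True, True, False, False]

for k in (1, 2):
    sens, spec = sensitivity_specificity(panel_calls(flags, k), has_tumor)
    print(f"at least {k} positive marker(s): sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data set, raising the requirement from one to two positive markers removes the false positive but adds a false negative, which mirrors the direction of the changes reported in Table 2.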
Discussion
A central feature of this study was that tumor burden could be gauged over time in Pirc rats by colonoscopy. Direct longitudinal analysis is increasingly important considering the new finding that 78% of human colon polyps do not grow, and even regress, whereas 22% of polyps grow (25). These growing lesions correlated to a high degree with adenomas that progressed histologically, whereas static or regressing lesions tended to be early adenomas or nonneoplastic lesions. Removal of nonthreatening polyps is risky and unnecessary, whereas growing tumors pose a significant health threat and must be removed early for an increased chance of survival (26, 27). In this study and others (20), we observed that 22% of adenomas monitored longitudinally in F1-Pirc rats were classified as static or regressing. Thus, this model can simulate the distribution of tumor fates observed in humans and can be used to discover markers of tumors that will grow and progress. Such markers could simultaneously increase the effectiveness of early detection efforts and decrease overdiagnosis.
This study tested the validity of candidate protein biomarkers derived from two discovery studies in the serum of ApcPirc/+ rats to detect predominantly growing colonic polyps. Transcriptome discovery analysis identified likely secreted candidates that were specific to the animal model and to the tumor, whereas the proteomics discovery study provided a list of putative candidates already identified in the blood of an intestinal cancer model. The proteomic discovery candidates predominantly originated from sources beyond the tumor itself, indicating that putative markers for early detection in serum need not originate from the tumor tissue. We have previously shown this phenomenon in the ApcMin/+ mouse, in which most of the 40 tumor-associated proteins were presumably secreted from organs other than the intestine, predominantly the liver of the tumor-bearing mouse (7). In this study, only 2 of the 12 transcriptome-derived candidates, LRG1 and MMP7, were upregulated in serum and showed diagnostic promise, with LRG1 also having been discovered in the proteomics study. A total of 7 of the 12 transcriptome candidates were not analyzed in serum because they were not visible in the SRM-MS assay. These data indicate that transcriptome analysis alone has limited potential to contribute to blood biomarker studies. In contrast, of the protein candidates selected from the ApcMin/+ mouse serum proteome discovery study, six of the nine proteins showed differential expression in the current ApcPirc/+ rat validation study, indicating that this approach has higher potential for predicting blood protein biomarkers.
The longitudinal design of this study allowed us to investigate aspects of tumor biology undetectable by single-time point studies. For example, markers that appeared to be correlated with tumor burden may also be markers of advancing animal age. The longitudinal design also defined proteins with more stable concentration ranges over time, reducing one source of biological variation. Together, these features of the time course have produced a set of high priority biomarker candidates whose biological significance to colon cancer is discussed below.
Epidermal growth factor receptor
In this study, EGFR was markedly downregulated with high statistical confidence in rats with increasing tumor burden. Using EGFR as an indicator, all animals bearing adenomas were identified at the 135-day time point, making this protein the only marker with 100% sensitivity at any time point studied. Similar results were obtained in our ApcMin/+ mouse study, in which our discovery proteomics data showed a significant decrease in EGFR expression. However, our validation study in the mice was not consistent with this result. This discrepancy may reflect the known wide variability of EGFR expression in colon cancer (28). Moreover, there have been numerous reports of EGFR-negative colon tumors, and many immunohistochemistry studies of invasive colon tumor tissue show an upregulation of EGFR, with a poor survival prognosis (29). Thus, the complexity of EGFR dysregulation in cancers is under intense study (30). In summary, whether EGFR expression goes up or down in serum, its concentration was significantly different in tumor-bearing animals compared with tumor-free animals in all of our proteomics analyses. Therefore, it should be considered an important indicator of tumor presence.
Leucine-rich alpha-2 glycoprotein 1
In our study, LRG1 was upregulated in the serum of F1-Pirc rats. In past studies, this acute-phase response protein has been upregulated in the blood of humans and murine models of colon cancer, with our F1-Pirc data providing further evidence of its upregulation in intestinal cancer (7, 9, 31). Studies have shown that this protein is also upregulated in the serum of patients with ulcerative colitis, suggesting that LRG1 may be a systemic indicator of intestinal disease (32). Until recently, the specific function of LRG1 was unknown. The suggested role of LRG1 in promoting angiogenesis via signaling by the TGFβ pathway through activation of ALK1-SMAD 1, 5, and 8 is a strong explanation for its upregulation in F1-Pirc rats and other colon cancer models (33). Angiogenesis, one of the fundamental attributes of tumor invasion and metastasis, can be triggered very early in tumor formation (34). Thus, LRG1 may be a versatile marker for the detection of early adenomas and later-stage intestinal cancers.
Inter-alpha trypsin inhibitor, heavy chain 4
We identified inter-alpha trypsin inhibitor, heavy chain 4 (ITIH4) as upregulated in the serum of F1-Pirc rats compared with wild-type. A related family member, ITIH3, was upregulated in our previous studies and others in the ApcMin/+ mouse (7, 9). Both of these proteins are inflammatory response proteins that are part of the heavy chain family of inter-alpha trypsin inhibitors. Inter-alpha trypsin inhibitors are known to bind and stabilize hyaluronic acid, thus assisting in the formation of large hyaluronan complexes. Increased size and complexity of these hyaluronan complexes are characteristics of the extracellular matrix of colon tumor tissue compared with normal colonic epithelium (35, 36). Thus, the increase in ITIH4 concentration in F1-Pirc rats may play an important role in extracellular matrix remodeling in colon tumor tissue.
Coagulation factor V (F5)
F5 is a clotting factor that has shown consistent upregulation in several intestinal cancer analyses of blood using murine models, including in the serum of F1-Pirc rats in this study (7, 8). F5 is a cofactor for activated coagulation factor X (Xa) that assists in cleaving prothrombin to form an active thrombin protein, which is vital for blood clotting (37). Perturbation in hemostasis is a commonly observed side effect of cancer, with venous thromboembolism as a documented complication in patients with colon cancer (38, 39). Moreover, polymorphisms in F5, such as F5 Leiden, are associated with increased risk of developing colorectal cancer (40). Thus, an increase in F5 concentration may play an important role in colon tumorigenesis, not only in F1-Pirc rats, but also in mice and humans.
Although EGFR, LRG1, ITIH4, and F5 show the best promise as diagnostic markers of colon cancer in ApcPirc/+ rats, other candidates should not be ruled out as markers for early human colon cancer, given the biological differences between rats and humans. For example, MMP7 protein is known to be upregulated both in colonic tumor tissues and in serum of patients with colorectal cancer (41). That study reported no differences in concentration with age in humans; therefore, the large age-dependent concentration variance in our study may be restricted to the rat model. Important prognostic markers such as CD44 showed no differential expression in this rat study, but prior studies have shown that blood CD44 levels are elevated in humans with advanced stages of colon cancer and in gastric cancer (42, 43). The F1-Pirc rats used do not develop locally invasive adenocarcinomas until beyond the 135-day time point (10), which may explain why CD44 did not show changes in this study.
In conclusion, this report unites the power of targeted quantitative proteomic analysis by SRM-MS with the unique biology of the ApcPirc/+ rat to detect early, operable colonic neoplasms. Future studies using F1-Pirc rats with a low tumor burden (1 or 2 tumors compared with no tumors on the same genetic background; ref. 44) can be used to control for potential signals that may be extraneous to colonic tumors but generated by the broadly expressed Apc mutation. This low-multiplicity model we have developed coupled with quantitative tumor volume measurements (45) will enable us to discover whether even single tumors can be detected. In addition, one could determine whether proteins are differentially expressed between growing and static adenomas. SRM-MS analysis provides a standardized platform for future studies, and allows quantitation of many proteins in a single assay. Moreover, this technology can be used to explore proteomic differences in serum, plasma, and other biological materials across different species. The ApcPirc/+ rat has provided a controlled model system to preliminarily identify serum proteins that consistently change in the presence of growing colonic polyps. These data provide a candidate list of markers that can be transferred to human validation studies to test their sensitivity and specificity for the early detection of colon cancer.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: M.M. Ivancic, A.A. Irving, K.G. Jonakin, W.F. Dove, M.R. Sussman
Development of methodology: M.M. Ivancic, A.A. Irving, K.G. Jonakin, M.R. Sussman
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M.M. Ivancic, A.A. Irving, M.R. Sussman
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): M.M. Ivancic, A.A. Irving, W.F. Dove, M.R. Sussman
Writing, review, and/or revision of the manuscript: M.M. Ivancic, A.A. Irving, W.F. Dove, M.R. Sussman
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M.M. Ivancic, M.R. Sussman
Acknowledgments
The authors thank Linda Clipson for her assistance with data management and valuable input, along with Alexandra Shedlovsky, in the writing of this report. The authors thank Ruth Sullivan for scoring the pathology of the rat tumors. Jim Taubel, Terry Fritter, and our animal care staff have been responsible for the reliable maintenance of the Pirc rat colony in McArdle. The authors appreciate the guidance in mass spectrometry provided by Lori Van Ness, Dr. Gregory A. Barrett-Wilt, and Grzegorz Sabat at the UW-Madison Biotechnology Center Mass Spectrometry Facility. Dr. Melissa Boersma and Nina Porcaro provided synthetic peptides at the UW-Madison Biotechnology Center Peptide Synthesis core facility.
Grant Support
Financial support, including the source and number of grants, for each author: National Cancer Institute, R01 CA063677 and NIH Institutional Clinical and Translational Research Grant Program (PI: M. Dressner) (to W.F. Dove). University of Wisconsin Comprehensive Cancer Center Investigator-Initiated Pilot Project (to M.R. Sussman). NIH, 5 T32 GM08349 and Advanced Opportunity Fellowship through SciMed Graduate Research Scholars at University of Wisconsin-Madison (to M.M. Ivancic). Morgridge Predoctoral Fellowship and National Institute of Environmental Health Sciences Pre-Doctoral Training grant, T32ES007015-33 (to A.A. Irving). NHGRI training grant to the Genomic Sciences Training Program, 5T32HG002760 (to K.G. Jonakin).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.