Background: Although overall 5-year survival rates for ovarian cancer are poor (10-30%), stage I/IIa patients have a 95% 5-year survival. New biomarkers that improve the diagnostic performance of existing tumor markers are critically needed. A previous study by Zhang et al. reported identification and validation of three biomarkers using proteomic profiling that together improved early-stage ovarian cancer detection.

Methods: To evaluate these markers in an independent study population, postdiagnostic/pretreatment serum samples were collected from women hospitalized at the Mayo Clinic from 1980 to 1989 as part of the National Cancer Institute Immunodiagnostic Serum Bank. Sera from 42 women with ovarian cancer, 65 with benign tumors, and 76 with digestive diseases were included in this study. Levels of various posttranslationally modified forms of transthyretin and apolipoprotein A1 were measured in addition to CA125.

Results: Mean levels of five of the six forms of transthyretin were significantly lower in cases than in controls. The specificity of a model including transthyretin and apolipoprotein A1 alone was high [96.5%; 95% confidence interval (95% CI), 91.9-98.8%] but sensitivity was low (52.4%; 95% CI, 36.4-68.0%). A class prediction algorithm using all seven markers, CA125, and age maintained high specificity (94.3%; 95% CI, 89.1-97.5%) but had higher sensitivity (78.6%; 95% CI, 63.2-89.7%).

Conclusions: We were able to replicate the findings reported by Zhang et al. in an independently conducted blinded study. These results provide some evidence that including age of patient and these markers in a model may improve specificity, especially when CA125 levels are ≥35 units/mL. Influences of sample handling, subject characteristics, and other covariates on biomarker levels require further consideration in discovery and replication or validation studies. (Cancer Epidemiol Biomarkers Prev 2006;15(9):1641–6)

Recent advances in proteomic profiling technologies have made it possible to associate changes in protein expression with disease conditions, allowing the identification of biomarkers that can be combined to generate a multimarker panel to improve disease diagnosis. In particular, there have been several attempts to use serum proteins for the detection of early-stage ovarian cancer (1-4). In ovarian cancer, more than two thirds of cases are detected at an advanced stage, resulting in poor overall 5-year survival rates of 10% to 30% (5). This is in stark contrast to stage I/IIa patients with 95% 5-year survival (5). Longitudinal studies are under way in Europe, Japan, and the United States to evaluate screening strategies using CA125 and/or transvaginal sonography and their effect on overall cancer mortality. Although it is not known whether a survival benefit will be observed among patients diagnosed early through a screening regimen, the existing serum markers such as CA125, CA 72-4, and macrophage colony stimulating factor do not have adequate sensitivity or specificity to be used as screening tools (6). Proteomic technologies have been used to search for new biomarkers that may improve the diagnostic performance of existing markers.

In one study, Zhang et al. (2) reported using surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) to identify three biomarkers that simultaneously improved the detection of early-stage ovarian cancer, in particular test specificity. Subsequently, quantitative chromatographic assays were developed for biomarkers identified at mass to charge ratios (m/z) 28,043 (apolipoprotein A1), 12,828 (a truncated form of transthyretin), and 3,272 (a fragment of inter-α-trypsin inhibitor IV). These three markers were found to differentiate ovarian cancer cases from healthy women with higher accuracy than CA125 alone. Applied to an independent validation sample set of sera from early-stage ovarian cancer cases and healthy controls, the sensitivity of a multivariable model combining the three biomarkers and CA125 at a matched specificity of 97% [95% confidence interval (95% CI), 89-100%] was 74% (95% CI, 52-90%), a small improvement over that of CA125 alone (65%; 95% CI, 43-84%; ref. 2). When compared at a fixed sensitivity of 83% (95% CI, 61-95%), the specificity of the model including CA125 and the proteomic biomarkers (94%; 95% CI, 85-98%) was significantly better than that of CA125 alone (52%; 95% CI, 39-65%).

To evaluate the discriminatory power of these markers, we measured them in an independent study population using a newly developed chromatographic SELDI-TOF-MS–based assay for quantification. We analyzed postdiagnostic/pretreatment serum samples collected from women hospitalized with ovarian cancer, benign ovarian tumors, and digestive disorders (hernias and gallstones) at the Mayo Clinic and stored at the National Cancer Institute Immunodiagnostic Serum Bank for apolipoprotein A1 and posttranslationally modified forms of transthyretin because these biomarkers could significantly discriminate ovarian cancer cases from controls (2). Cautioned by a number of recent articles that have called into question the reproducibility and relevance of reported proteomic biomarkers in cancer detection (6-9), and the absence of positive validation studies in the literature, we aimed to carefully address several previously raised points of criticism in our analysis, including sources of marker variability among noncancer controls.

Patient Population

Serum samples (n = 238) were selected from subjects whose blood was collected at the Mayo Clinic between 1980 and 1989 (10). Once collected, samples were shipped to the National Cancer Institute Immunodiagnosis Serum Bank (Rockville, MD) and stored at −70°C to −76°C until use. The National Cancer Institute Immunodiagnosis Serum Bank contains −70°C cryopreserved sera collected between 1980 and 1989 from Mayo Clinic patients diagnosed with a wide variety of malignant, benign, and nonneoplastic conditions (10). For the present study, sera were selected from all available samples from women with malignant (n = 45) or benign (n = 71) ovarian tumors, and from 122 female controls with abdominal hernias (ICD-9CM codes 553.1-553.3) or gallstones (ICD-9CM codes 574.1-574.4), frequency matched for age to cases. Information available on all subjects included age of patient at diagnosis, smoking status (never, past, or current; number of packs per day; years of smoking), and ICD code for disease, as well as the draw date and number of freeze-thaws for serum samples. Histologic subtype was known for all malignant and benign tumors. Stage and grade were provided for all cases. Six individuals whose serum samples were previously thawed and refrozen ≥1 times were excluded from analyses. Two cases were excluded for divergent histology (one diagnosed with mesothelioma and another with signet ring cell carcinoma) and one benign tumor was excluded for early age at diagnosis (3 years). Therefore, the final analyses included 42 ovarian cancer cases, 65 women with benign tumors, and 122 noncancer controls with digestive diseases.

Materials. Sinapinic acid (5-mg vial; Ciphergen, Fremont, CA); sample denaturation buffer (Ciphergen); IMAC-30 array (Ciphergen); Bioprocessor (Ciphergen); Q10 ProteinChip Array (Ciphergen); human prealbumin, purified (Biodesign International, Saco, ME); and apolipoprotein A1 calibrators (K-ASSAY, Kamiya Biomedical, Seattle, WA).

Assays for Transthyretin and Apolipoprotein A1. To quantitatively measure and compare apolipoprotein A1 and posttranslationally modified forms of transthyretin concentrations in patient sera, a SELDI-TOF-MS Protein Chip array chromatographic assay was developed for each marker. The following procedures were done on a Tecan Aquarius-96 robotic workstation. Assays were run in triplicate. For apolipoprotein A1, IMAC ProteinChip Arrays were precharged with 50 μL of 50 mmol/L CuSO4 for 10 minutes, washed four times with deionized water, and then equilibrated with IMAC binding/washing buffer [50 mmol/L sodium phosphate, 250 mmol/L NaCl (pH 6.0)], twice each for 5 minutes. Five microliters of serum sample were denatured with 7.5 μL of sample denaturation buffer [9 mol/L urea, 2% CHAPS 50 mmol/L Tris-HCl (pH 9.0)] for 20 minutes on a shaker at room temperature. The spots were washed thrice with 150 μL of the binding/washing buffer, pipetting up and down 10 times for each wash. The spots were then rinsed twice with 150 μL of water. Excess water was aspirated and the spots allowed to air-dry for 10 minutes. To each spot, 1 μL of sinapinic acid matrix dissolved in 50% acetonitrile/0.5% trifluoroacetic acid in water at a concentration of 12.5 mg/mL was deposited. After allowing the spots to air-dry for 10 minutes, matrix was added again. The SELDI-based chromatographic assay for transthyretin was done using the anion exchange Q10 ProteinChip Array as previously described (11). A set of transthyretin calibrators (purified human prealbumin was reconstituted in binding/washing buffer) and a set of apolipoprotein A1 calibrators were used to monitor assay performance and assay linearity. Calibrants (representing serial dilutions of transthyretin and apolipoprotein A1) were treated exactly as serum samples. In the experimental runs, each cassette incorporated one set of calibrators with the remaining samples.

Data Acquisition. The arrays were read in a PCS4000 ProteinChip Reader, a time-lag focusing, linear laser desorption/ionization-time-of-flight mass spectrometer. The instruments were internally calibrated on a daily basis. All spectra were acquired in the positive-ion mode. Time-lag focus mass was set at 14,000 Da for transthyretin and 28,000 Da for apolipoprotein A1. Sampling rate was set at 800 MHz. Ions were extracted using 3.4-kV ion extraction pulse and accelerated to final velocity using 25-kV acceleration potential. The system employed a pulsed nitrogen laser at repetition rate of 20 Hz. Laser pulse energy of 1,500 to 2,000 nJ was delivered into a 100-μm diameter area, and this illuminated area was rastered across a 2-mm diameter sample spot. An automated analytic protocol was used to control the data acquisition process in most of the sample analysis. Each spectrum was an average of at least 1,000 laser shots and externally calibrated against a mixture of known peptides or proteins.

Data Processing. Raw data obtained from the PCS4000 ProteinChip Reader were first smoothed by a fixed-width moving average filter of 25 data points, and then a convex hull baseline subtraction algorithm was applied to the smoothed data. Data were then internally normalized using total ion current with the Ciphergen Express 3.0 software. Six peaks corresponding to transthyretin biomarkers, including a truncated form (T1; m/z 12,852), an unmodified form (T2; m/z 13,773), and four posttranslationally modified forms [sulfonated (T3; +SO2H; m/z 13,857), cysteinylated (T4; +Cys; m/z 13,893), cysteinylated and glycinylated (T5; +Cys-Gly; m/z 13,933), and glutathionylated (T6; +Glut; m/z 14,111)], were manually labeled and their intensities recorded from the Q10 ProteinChip Array data while blinded to disease status (Fig. 1A). A peak corresponding to serum apolipoprotein A1 located at m/z 28,107 (A1) was manually selected and its intensity recorded from the IMAC array data (Fig. 1B).
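The processing order described above (smoothing, baseline subtraction, normalization) can be sketched as follows. This is a minimal illustration in Python, not the Ciphergen Express 3.0 implementation: the function names are hypothetical, the shrinking window at the spectrum edges is an assumption, the baseline is taken to be the lower convex hull of the smoothed trace, and total ion current normalization is simplified to scaling each spectrum to a common summed intensity.

```python
def moving_average(values, width=25):
    """Centered fixed-width moving average; the window shrinks at the edges (assumption)."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def _cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_baseline(values):
    """Lower convex hull of the points (index, intensity), interpolated at every index."""
    if len(values) < 2:
        return list(values)
    hull = []
    for point in enumerate(values):
        # Drop earlier points that would lie above the hull (monotone chain, lower half).
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], point) <= 0:
            hull.pop()
        hull.append(point)
    baseline = []
    seg = 0
    for i in range(len(values)):
        while seg < len(hull) - 2 and hull[seg + 1][0] < i:
            seg += 1
        (x0, y0), (x1, y1) = hull[seg], hull[seg + 1]
        baseline.append(y0 + (y1 - y0) * (i - x0) / (x1 - x0))
    return baseline


def normalize_tic(values, target=1.0):
    """Scale a spectrum so that its summed intensity equals `target` (simplified TIC)."""
    total = sum(values)
    if total == 0:
        return list(values)
    return [v * target / total for v in values]


def preprocess(raw, width=25):
    """Smooth, subtract the convex hull baseline, then normalize, in that order."""
    smoothed = moving_average(raw, width)
    baseline = convex_hull_baseline(smoothed)
    corrected = [max(s - b, 0.0) for s, b in zip(smoothed, baseline)]
    return normalize_tic(corrected)
```

Peak intensities for T1 to T6 and A1 would then be read from the processed trace at the labeled m/z positions; in the study this labeling was done manually.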

Figure 1.

A. Posttranslationally modified transthyretin peaks quantitated on a Q10 ProteinChip Array. B. Apolipoprotein A1 peaks quantitated on an IMAC ProteinChip Array.


Immunoassay Protein Analyses. CA125 levels (units/mL) were obtained using an Elecsys 1010 immunoassay analyzer (Roche Diagnostics, Indianapolis, IN). Transthyretin levels (mg/mL) were obtained using an immunoprecipitation procedure (immunoturbidimetric assay; Pacific Biometrics, Seattle, WA). Samples were mixed with a polymeric enhancer and antiserum. A calibration curve was constructed from a series of five standards with known transthyretin concentrations. A Logit/Log plot was constructed and unknown values were determined by interpolation. Samples from quality control pools were assayed in each analytic run and results were compared with historical values for the aliquots of these QC pools.
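A logit/log calibration of this kind can be sketched as below. The sketch assumes that responses are rescaled to the interval (0, 1) using assumed lower and upper response limits (`r_min`, `r_max`) before the logit transform, and that a straight line is fitted by ordinary least squares. The function names, standard concentrations, and responses are hypothetical, and nothing here is taken from the analyzer software used in the study.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fit_logit_log(concentrations, responses, r_min, r_max):
    """Least-squares line through (log concentration, logit of rescaled response)."""
    xs = [math.log(c) for c in concentrations]
    ys = [logit((r - r_min) / (r_max - r_min)) for r in responses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def interpolate_concentration(response, slope, intercept, r_min, r_max):
    """Invert the fitted line to read an unknown concentration off the curve."""
    y = logit((response - r_min) / (r_max - r_min))
    return math.exp((y - intercept) / slope)


# Five made-up standards (hypothetical concentrations and responses).
standards = [50.0, 100.0, 200.0, 400.0, 800.0]
signals = [0.21, 0.36, 0.55, 0.72, 0.85]
slope, intercept = fit_logit_log(standards, signals, r_min=0.0, r_max=1.0)
unknown = interpolate_concentration(0.60, slope, intercept, r_min=0.0, r_max=1.0)
```

Interpolation is only meaningful for responses that fall between the lowest and highest standards.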

Statistical Analyses

Demographic characteristics of study participants were compared using Fisher's exact tests. T-tests were used to assess differences in mean marker intensities between ovarian cancer cases, benign ovarian tumor, and digestive disease controls. We estimated the effects of covariates on levels of serum biomarkers among controls using ANOVA (PROC GLM; ref. 12). Letting $z_i$ denote the intensity for a marker for sample $i$, the general statistical model was $z_i = \mu + b_i + d_i + \varepsilon_i$, where $\mu$ was the overall mean intensity, $b_i$ was the effect of the age group (<37, 37-56, 56-68, and ≥68 years), and $d_i$ was the effect of the storage time indicator (0 if the sample was collected before 1986 and 1 if after). $\varepsilon_i$ was a normally distributed random variable with mean zero and variance $\sigma_e^2$ representing all sources of variance not specified in the model. Additionally, we used ANOVA to estimate least squares means of marker levels (including CA125 and transthyretin measured by the immunoassay) for cases, benign disease, and controls combined, adjusted for case status ($a_i$), age group ($b_i$), and storage time ($d_i$), from the model $z_i = \mu + a_i + b_i + d_i + \varepsilon_i$. In a second model, we also included an indicator variable for the clinical cutoff value of CA125 (coded 1 if CA125 ≥ 35 units/mL and 0 otherwise).
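The covariate coding used in these models can be illustrated with a short Python sketch. It is not the PROC GLM code used in the study. The function names are hypothetical, the age boundaries are assumed to be half-open (an age equal to a cut point falls in the older group), and samples drawn in 1986 are assumed to be coded 1.

```python
AGE_CUTS = (37, 56, 68)  # boundaries of the four age groups named in the text


def age_group(age_years):
    """Age group index 0-3; ages equal to a cut point go to the older group (assumption)."""
    return sum(age_years >= cut for cut in AGE_CUTS)


def storage_indicator(draw_year):
    """0 if drawn before 1986, otherwise 1 (1986 itself coded 1 by assumption)."""
    return 0 if draw_year < 1986 else 1


def ca125_indicator(ca125_units_per_ml):
    """1 if CA125 is at or above the clinical cutoff of 35 units/mL, otherwise 0."""
    return 1 if ca125_units_per_ml >= 35 else 0


def design_row(age_years, draw_year, ca125_units_per_ml):
    """Intercept, three age-group dummies (youngest group as reference), storage, CA125."""
    group = age_group(age_years)
    age_dummies = [1 if group == g else 0 for g in (1, 2, 3)]
    return [1] + age_dummies + [storage_indicator(draw_year),
                                ca125_indicator(ca125_units_per_ml)]
```

For example, `design_row(61, 1987, 468.0)` places a 61-year-old with a 1987 draw and elevated CA125 in the third age group with both indicators set to 1.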

Classification models were built to predict outcome status (case, benign, or control) and control subgroup (hernia versus gallstone). Independent predictors included the indicator variable for CA125 ≥35 units/mL, categorized age, transthyretin level (mg/mL), and log-transformed intensities corresponding to protein biomarkers T1 to T6 and apolipoprotein A1. Several classification models based on all known variables and variable subgroups were built using linear and quadratic discriminant analysis and nearest neighbor methods with PROC DISCRIM (12), as well as random forests (13). For details on these models and comparison of their performance in a previous study, see Wu et al. (14). To avoid overfitting and obtain unbiased estimation of the predictive ability of each model and analysis, misclassification rates were assessed using 10-fold cross-validation (15-17).
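As an illustration of the cross-validated nearest neighbor approach, the Python sketch below classifies each held-out sample from its K nearest training samples and repeats this over 10 folds. It is a simplified stand-in for PROC DISCRIM, not a reproduction of it: the function names are hypothetical, distances are plain Euclidean, ties between labels are broken in favor of the closest neighbor, and fold assignment is a simple shuffled split.

```python
import math
import random


def knn_predict(train_x, train_y, query, k=2):
    """Majority vote among the k nearest training points; ties go to the closest one."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = [train_y[i] for i in order[:k]]
    votes = {}
    for label in nearest:
        votes[label] = votes.get(label, 0) + 1
    best = max(votes.values())
    for label in nearest:  # nearest is ordered by distance
        if votes[label] == best:
            return label


def cross_validate(xs, ys, k=2, folds=10, seed=0):
    """Return one out-of-fold prediction per sample."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    predictions = [None] * len(xs)
    for f in range(folds):
        test_idx = indices[f::folds]
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        for i in test_idx:
            predictions[i] = knn_predict(train_x, train_y, xs[i], k)
    return predictions


def sensitivity_specificity(truth, predicted, positive="case"):
    """Sensitivity among positives and specificity among all other labels combined."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, predicted))
    tn = sum(t != positive and p != positive for t, p in zip(truth, predicted))
    n_pos = sum(t == positive for t in truth)
    n_neg = len(truth) - n_pos
    return tp / n_pos, tn / n_neg
```

Because every prediction comes from a model that never saw the sample being classified, the resulting misclassification rates are less optimistic than rates computed on the training data.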

The mean storage time was shorter for patients with benign and malignant ovarian tumors compared with digestive disease controls (mean storage time, 17, 17, and 21 years, respectively). Using the chromatographic assay data from all 122 controls, we observed that protein levels decreased significantly with storage time for unmodified (T2; P = 0.008) and cysteinylated (T4; P = 0.05) transthyretin. To minimize bias in detection of protein levels that were negatively associated with storage time in addition to cancer status, the analysis of the biomarker data was limited to include only samples collected from 1983 to 1989, thereby decreasing the mean years in storage among control samples to 20.0 ± 1.2 years (Table 1). Only marker T4 (+Cys) remained significantly decreased with storage time after restriction of analysis to samples collected in 1983 or later. Among controls, age of patient at sample collection influenced levels for the truncated (T1; P = 0.01) form of transthyretin in models adjusted for years in storage.

Table 1.

Description of subjects included in ovarian cancer early detection study

| Characteristics | Controls (N = 76) | Benign tumor (N = 65) | Cases (N = 42) |
| --- | --- | --- | --- |
| Age (y), mean (range) | 59 (19-88) | 41 (15-74) | 61 (21-78) |
| Smoking status, N (%) | | | |
| Never | 48 (63%) | 37 (57%) | 29 (69%) |
| Past | 13 (17%) | 11 (17%) | 11 (26%) |
| Current | 10 (13%) | 17 (26%) | 2 (5%) |
| Packs/d (mean ± SD) | 1.3 ± 2.9 | 1.0 ± 2.1 | 0.7 ± 2.1 |
| CA125 (units/mL), median (interquartile range) | 15.1 (9.6-113.7) | 18.3 (20.7-216.5) | 468.0 (793.8-17,154) |
| CA125 >35 units/mL, N (%) | 7 (9.21%) | 16 (24.62%) | 37 (88.1%) |
| Years in storage (mean ± SD) | 20.0 ± 1.2 | 17.2 ± 1.6 | 16.8 ± 1.3 |
| Tumor grade, N (%) | | | |
| Well differentiated | | | 6 (14%) |
| Moderately differentiated | | | 6 (14%) |
| Poorly/undifferentiated | | | 29 (69%) |
| Unknown | | | 1 (2%) |
| Tumor stage, N (%) | | | |
| I/II | | | 9 (21%) |
| III/IV | | | 33 (79%) |

The clinical characteristics and age distribution of subjects included in the final study analysis are presented in Table 1. Cancer cases and controls were similar with respect to age due to frequency matching on age. Benign ovarian tumor patients were younger than cases or digestive disease controls (P < 0.0001). The majority of serum samples from women with benign tumors and ovarian malignancies were collected between 1986 and 1989 (89% and 95%, respectively) whereas 66% of digestive disease control samples were collected before 1986 (data not shown). More women with benign tumors reported currently and ever smoking cigarettes. Smoking prevalence did not statistically differ between cancer cases and controls. Serum concentrations of transthyretin protein (mg/mL) in cases were significantly lower compared with digestive disease (P < 0.0001, t-test) or benign tumor controls (P = 0.0002, t-test). Serum transthyretin levels did not significantly differ between digestive disease and benign tumor controls (P = 0.6). CA125 concentration (<35 versus ≥35 units/mL) was a significant predictor of cancer status, although 25% of women with benign disease and 9% of controls also had levels ≥35 units/mL. Among cancer cases, most women (79%) were diagnosed with advanced-stage disease whereas 21% were diagnosed with stage I/II cancer. CA125 level did not differ by tumor stage or grade (data not shown).

The frequencies of ovarian cancer and benign tumor histologic subtypes in the final analysis are presented in Table 2. Serous cystadenocarcinoma was the major tumor histologic subtype in this study (∼40%). Benign ovarian tumor patients had tumors of both epithelial and stromal origin.

Table 2.

Histologic subtypes of ovarian benign disease and cancer

| Histologic subtypes of benign tumors | N (%) | Histologic subtypes of ovarian cancer cases | N (%) |
| --- | --- | --- | --- |
| Adenofibroma, NOS | 5 (7.7) | Adenocarcinoma, NOS | 7 (16.7) |
| Brenner tumor, NOS | 1 (1.5) | Adenosquamous carcinoma | 1 (2.4) |
| Cystadenoma, NOS | 1 (1.5) | Brenner tumor, malignant | 1 (2.4) |
| Dermoid cyst, NOS | 9 (13.8) | Carcinoma, NOS | 2 (4.8) |
| Fibroma, NOS | 5 (7.7) | Carcinoma, anaplastic, NOS | 1 (2.4) |
| Leiomyoma, NOS | 1 (1.5) | Clear cell adenocarcinoma, NOS | 1 (2.4) |
| Mucinous cystadenoma, NOS | 9 (13.8) | Endometrioid carcinoma | 3 (7.1) |
| Papillary cystadenoma, NOS | 1 (1.5) | Mucinous cystadenocarcinoma, NOS | 3 (7.1) |
| Serous adenofibroma | 5 (7.7) | Mullerian mixed tumor | 1 (2.4) |
| Serous cystadenoma, NOS | 8 (12.3) | Papillary adenocarcinoma, NOS | 3 (7.1) |
| Teratoma, benign | 16 (24.6) | Papillary mucinous cystadenocarcinoma | 2 (4.8) |
| Thecoma | 3 (4.6) | Papillary serous cystadenocarcinoma | 10 (23.8) |
| Missing | 1 (1.5) | Serous cystadenocarcinoma, NOS | 7 (16.7) |
| Total | 65 (100) | Total | 42 (100) |

In Table 3, Spearman correlation coefficients between markers quantified with the chromatographic and immunobased assays are presented separately for ovarian cancer cases, benign ovarian tumor, and digestive disease controls. In each group, the rank of the transthyretin protein levels measured with the immunoassay was correlated with those of the posttranslationally modified forms of transthyretin (T2-T6) and apolipoprotein A1 biomarkers, with the exception of truncated transthyretin (T1) and CA125. CA125 levels were correlated with most biomarkers among cases, with the exception of markers T1 (correlation = 0.25, P = 0.11) and T3 (correlation = −0.36, P = 0.18). CA125 was not associated with levels of any other biomarker among benign tumor or digestive disease controls.

Table 3.

Correlation of transthyretin and CA125 immunoassay levels with apolipoprotein A1 and posttranslationally modified forms of transthyretin by disease status

| | CA125 level (units/mL), correlation coefficient* (P) | | | Transthyretin level (mg/mL), correlation coefficient* (P) | | |
| --- | --- | --- | --- | --- | --- | --- |
| | Control | Benign | Case | Control | Benign | Case |
| CA125 (units/mL) | 1.00 | 1.00 | 1.00 | −0.09 (0.47) | −0.06 (0.7) | −0.53 (0.0002) |
| Transthyretin (mg/mL) | −0.09 (0.47) | −0.06 (0.66) | −0.53 (0.0002) | 1.00 | 1.00 | 1.00 |
| T1-m/z 12,852 | −0.14 (0.22) | 0.07 (0.57) | 0.25 (0.11) | 0.18 (0.13) | 0.15 (0.25) | −0.11 (0.47) |
| T2-m/z 13,773 | −0.17 (0.23) | 0.04 (0.76) | −0.48 (0.001) | 0.51 (<0.0001) | 0.35 (0.005) | 0.83 (<0.0001) |
| T3-m/z 13,857 | −0.14 (0.23) | −0.17 (0.18) | −0.36 (0.18) | 0.55 (<0.0001) | 0.46 (0.0002) | 0.87 (<0.0001) |
| T4-m/z 13,893 | 0.03 (0.81) | 0.03 (0.81) | −0.50 (0.0007) | 0.47 (<0.0001) | 0.34 (<0.0001) | 0.86 (<0.0001) |
| T5-m/z 13,933 | 0.03 (0.77) | 0.03 (0.80) | −0.42 (0.004) | 0.49 (<0.0001) | 0.49 (0.006) | 0.83 (<0.0001) |
| T6-m/z 14,111 | 0.09 (0.47) | −0.04 (0.79) | −0.45 (0.002) | 0.52 (<0.0001) | 0.51 (<0.0001) | 0.90 (<0.0001) |
| A1-m/z 28,107 | −0.08 (0.49) | −0.00 (0.97) | −0.48 (0.001) | 0.31 (0.006) | 0.34 (0.005) | 0.60 (<0.0001) |

*Spearman rank correlation coefficient.

Immunoassay measurement.
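For readers unfamiliar with the statistic reported in Table 3, a Spearman rank correlation can be computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The following Python sketch shows this under those standard definitions; the function names are hypothetical and the sketch does not compute the P values shown in the table.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

Because only ranks enter the calculation, the coefficient is unaffected by monotone transformations such as taking the logarithm of peak intensities.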

In Table 4, the unadjusted least squares mean and SE estimates of serum protein levels are given separately for cancer cases, benign tumor, and digestive disease controls. Only CA125 and T1 were higher in cases compared with controls. Protein markers T2 to T6 and apolipoprotein A1 were lower in cases. The mean levels of CA125, total transthyretin, and protein markers T1 to T6 were significantly different between case, benign tumor, and control groups in a model adjusted for age quartile and storage time indicator. Protein markers T1 to T3 decreased and apolipoprotein A1 increased significantly with age of patient (data not shown). Serum CA125 and transthyretin protein levels were not associated with smoking status, age of patient, or sample storage time (data not shown).

Table 4.

Least squares mean estimates for patient status by cancer marker

| Biomarkers | Unadjusted model mean (SD), Control | Unadjusted model mean (SD), Benign | Unadjusted model mean (SD), Case | P, adjusted* model |
| --- | --- | --- | --- | --- |
| CA125* (units/mL) | 20.5 (19.3) | 29.9 (32.7) | 1,563.9 (3,350.0) | <0.0001 |
| Transthyretin* (mg/mL) | 294.8 (78.6) | 287.1 (75.8) | 221.9 (89.1) | <0.0001 |
| Transthyretin (m/z) | | | | |
| T1-12,852*,† | 2.8 (0.8) | 2.8 (0.6) | 3.6 (1.5) | <0.0001 |
| T2-13,773*,†,‡ | 9.3 (4.3) | 9.6 (4.2) | 7.3 (4.3) | 0.001 |
| T3-13,857*,† | 5.6 (1.8) | 6.1 (1.8) | 4.5 (2.2) | 0.002 |
| T4-13,893*,‡ | 18.8 (6.2) | 17.0 (6.2) | 15.5 (8.3) | <0.0001 |
| T5-13,933* | 6.8 (2.3) | 7.2 (2.9) | 5.4 (2.6) | 0.0001 |
| T6-14,111* | 4.4 (0.9) | 4.5 (1.0) | 3.7 (1.5) | <0.0001 |
| Apolipoprotein A1 (m/z) | | | | |
| A1-28,107 | 9.3 (4.3) | 8.2 (3.3) | 8.3 (3.1) | 0.32 |

*Significance of group differences in model adjusted for age quartile and storage time indicator variable.

†Significantly associated with age in adjusted model, P < 0.05.

‡Significantly associated with storage time indicator variable in adjusted model, P < 0.05.

In Table 5, the cross-validated sensitivity and specificity estimates for three prediction models are presented for the K-nearest neighbor algorithm with K = 2. Model 1 included age of patient in quartiles and an indicator variable for CA125 (<35 versus ≥35 units/mL). Model 2 included age of patient in quartiles and protein biomarker levels measured using the chromatographic assay. Model 3 was the most comprehensive and included age in quartiles, the CA125 indicator variable, and all protein biomarkers measured with the chromatographic assay. Sensitivity and specificity estimates to predict cancer/noncancer status were calculated by combining all noncancer subgroups. Models 1 and 3 were the most sensitive to predict cancer/noncancer status [73.8% (95% CI, 58.0-86.1%) and 78.6% (95% CI, 63.2-89.7%), respectively]. Model 2 had the lowest sensitivity (52.4%; 95% CI, 36.4-68.0%). The overall specificity of model 2, which included the transthyretin and apolipoprotein A1 markers without CA125, was as high as that of models 1 and 3, which included CA125. When sensitivity was determined in early- and late-stage cases separately, two additional early-stage cases were identified by combining all markers (model 3). Model 2 did poorly in late-stage cases but improved with the addition of CA125 to the model. The specificity of all three models was similarly high for benign tumor controls. Among noncancer subjects with CA125 levels ≥35 units/mL, 15 of 23 were correctly classified as noncancer by model 1 through the addition of age to the model. The specificity of model 2 was higher than that of models 1 and 3 in this subgroup.

Table 5.

Cross-validated sensitivity and specificity estimates for various prediction models using the K-nearest neighbor algorithm with K = 2 (proteomic markers, when included into a model, were log transformed)

The first two columns give sensitivity and specificity (95% CI*) to discriminate cancer from noncancer; the remaining columns give sensitivity and specificity (95% CI*) by subgroup.

| Model | Total sensitivity | Total specificity | Stage I/II sensitivity | Stage III/IV sensitivity | Specificity, benign tumors | Specificity, noncancer subjects with CA125 ≥35 units/mL |
| --- | --- | --- | --- | --- | --- | --- |
| Model 1 | 31/42 (73.8%; 58.0-86.1%) | 133/141 (94.3%; 89.1-97.5%) | 3/9 (33.3%; 7.5-70.1%) | 28/33 (84.9%; 68.1-94.9%) | 60/65 (92.3%; 83.0-97.5%) | 15/23 (65.2%; 42.7-83.6%) |
| Model 2 | 22/42 (52.4%; 36.4-68.0%) | 136/141 (96.5%; 91.9-98.8%) | 3/9 (33.3%; 7.5-70.1%) | 19/33 (57.6%; 39.2-74.5%) | 62/65 (95.4%; 87.1-99.0%) | 21/23 (91.3%; 72.0-98.9%) |
| Model 3 | 33/42 (78.6%; 63.2-89.7%) | 133/141 (94.3%; 89.1-97.5%) | 5/9 (55.6%; 21.2-86.3%) | 28/33 (84.9%; 68.1-94.9%) | 59/65 (90.8%; 81.0-96.5%) | 15/23 (65.2%; 42.7-83.6%) |

Abbreviation: ApoA1, apolipoprotein A1.

*Exact binomial confidence intervals.

Model 1: CA125 <35, ≥35, and age in quartiles.

Model 2: all proteomic markers (T1-T6, ApoA1) and age in quartiles.

Model 3: CA125 <35, ≥35, all proteomic markers (T1-T6, ApoA1), and age in quartiles.
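The exact binomial confidence intervals in Table 5 are consistent with the Clopper-Pearson construction. The Python sketch below computes such an interval by bisection on the binomial distribution function. The function names are hypothetical, and the sketch assumes a two-sided 95% interval with equal tail probabilities; it is offered as an illustration rather than as the routine used by the authors.

```python
import math


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))


def _root_of_increasing(func, iterations=100):
    """Bisection on [0, 1] for a function that increases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n trials."""
    if x == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= x | p) = alpha / 2
        lower = _root_of_increasing(
            lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    if x == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= x | p) = alpha / 2
        upper = _root_of_increasing(
            lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


low, high = exact_binomial_ci(3, 9)  # stage I/II sensitivity of models 1 and 2
```

For 3 of 9 early-stage cases this gives limits of roughly 7.5% and 70.1%, matching the interval reported in Table 5 and illustrating how wide the intervals are for subgroups of this size.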

The primary purpose of this work was to reproduce findings presented in Zhang et al. (2) in an independent study population. As in the previous report, differential analysis of protein profiles was conducted using serum samples collected postdiagnosis/pretreatment from ovarian cancer patients and women with benign ovarian tumors. When examined individually, levels of posttranslationally modified forms of transthyretin and apolipoprotein A1 were significantly lower in cancer cases compared with controls. In an ANOVA adjusted for age of patient and draw year, all biomarkers except apolipoprotein A1 differed significantly between cases and benign ovarian tumor or digestive disease controls. When these markers were combined into different prediction models, we found that the specificity of model 2, which included only the posttranslationally modified forms of transthyretin and apolipoprotein A1 (96.5%; 95% CI, 91.9-98.8%), was as high as that of model 3, which included age of patient, markers, and CA125 (94.3%; 95% CI, 89.1-97.5%), but sensitivity was lower [52.4% (95% CI, 36.4-68.0%) versus 78.6% (95% CI, 63.2-89.7%), respectively]. The previous study (2) had also reported that these markers primarily improved specificity with only marginally increased sensitivity in multivariable models. Unlike Zhang et al., who compared women with ovarian cancer and healthy controls, this study evaluated the power of the markers to discriminate sera from women with ovarian cancer from sera from women with other diseases, including benign ovarian tumors and gastrointestinal disorders. In our study, 25% of women with benign disease and 9% of controls also had CA125 levels ≥35 units/mL, higher than in the previous study (2).

Previous studies have reported differences in both transthyretin and apolipoprotein A1 among ovarian cancer cases compared with controls; however, these proteins are not specific to ovarian cancer. Serum transthyretin levels decrease rapidly in response to malnutrition and injury, and elevated levels have been observed in patients taking medications and in patients with Hodgkin's disease (18). Apolipoprotein levels reflect HDL protein concentration differences observed among healthy controls but are also influenced by other possible confounders such as age, alcohol intake, hormone use, sex, race, body mass index, and coronary artery disease (19, 20). CA125 is often elevated during menstruation and pregnancy and in women with endometriosis, and it decreases with age (21). Like transthyretin and apolipoprotein A1, other biomarkers identified through “omic” technologies that are potentially able to differentiate healthy controls from ovarian cancer patients are also found in controls, are associated with other disease states, and are considered to be markers of the host metabolic response to cancer rather than specific to the cancer itself (2, 4, 11, 22-24). The major challenge in developing an ovarian cancer detection test is that it must be highly specific to avoid generating numerous false positives. This requirement and the fact that many markers are not specific to ovarian cancer underscore the need to understand how they are associated with sample processing and handling and other patient characteristics, in addition to their significance in pathogenesis. In this study, we examined associations between all markers and available information on subjects, including age of patient at diagnosis, smoking status, and ICD disease code, as well as draw date and number of freeze-thaw cycles per sample.

It was necessary to include age quartile in all models because it was positively associated with apolipoprotein A1 and inversely associated with T1 to T3, and because the women diagnosed with benign cysts were younger than cancer cases or digestive disease controls. If age were eliminated from the models, the specificity of all three models to detect controls with benign cysts would be lower (Table 5). When used alone, apolipoprotein A1 and transthyretin markers were useful for detecting controls with CA125 levels ≥35 units/mL but poor at identifying cases with late-stage cancers. The addition of CA125 as a dichotomous variable to the model increased sensitivity to detect high-stage cancers; however, the added specificity was lost.
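
The two covariates discussed in this paragraph, age quartile and dichotomized CA125, can be sketched as simple derived variables. This is only an illustration: the ages are made-up values, the quartile cut points are computed from that hypothetical list (the text does not say which group defined the quartiles), and the function names are assumptions. Only the 35 units/mL threshold comes from the text.

```python
from bisect import bisect_right
from statistics import quantiles

CA125_CUTOFF = 35.0  # units/mL, the threshold used in the text


def ca125_elevated(level):
    """Dichotomize CA125: 1 if the level is >= 35 units/mL, else 0."""
    return int(level >= CA125_CUTOFF)


def age_cutpoints(ages):
    """Three cut points that split the supplied ages into quartiles."""
    return quantiles(ages, n=4)


def age_quartile(age, cutpoints):
    """Quartile index (1 to 4) of an age relative to the cut points."""
    return bisect_right(cutpoints, age) + 1


# HYPOTHETICAL ages, for illustration only (not study data).
ages = [34, 41, 47, 52, 58, 63, 69, 75]
cuts = age_cutpoints(ages)
for age in ages:
    print(age, age_quartile(age, cuts))
```

Coding age as a quartile rather than a continuous value trades some information for robustness to the shape of the age-marker relationship, which may be why a categorical form was preferred here.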

This is one of the first studies to independently evaluate findings previously reported in ovarian cancer early detection using the SELDI-TOF-MS platform. Two recent editorials emphasized the importance of assessing potential sources of bias in study design and conduct, as well as the effect of chance on data interpretation (7, 25). Several important lessons have been learned through the design and analysis of this study that might be considered in future investigations to reduce ongoing concerns over data validity and improve chances of reproducibility: (a) Convenience samples should be avoided for biomarker discovery or validation until the relationship between the biomarker and biases in sample handling, processing, storage, and common confounding variables is understood. Ignoring such factors in study design and analysis reduces the chance of future validation. In contrast to common practice in early phases of biomarker identification and validation, we recommend using prospectively collected samples and including comparison groups that are frequency matched to cases and well characterized for possible confounders. Such samples are not generally easy to obtain. (b) Inclusion of a clinically relevant comparison group, in addition to healthy population controls, may be needed to improve understanding of biomarker performance in future testing populations. (c) When relationships between biomarkers and potential confounders are not well understood, it may be advisable to restrict the samples rather than attempt to eliminate biases by statistical modeling. In biomarker discovery studies, many questions can be addressed in small, carefully designed studies that minimize possible sources of variability. Once biomarkers are identified, control samples can be used to identify relationships with possible confounding variables. Ideally, samples are restricted in the design phase, but sometimes such biases must be dealt with in the analysis.

The major factor biasing our results was length of sample storage time. Because the digestive disease control serum samples were stored longer than those from ovarian cancer cases or benign ovarian tumor controls, the protein levels of two markers would have decreased with storage time in these controls, biasing the comparisons between cases and digestive disease controls toward the null. We thus restricted the analysis to samples collected in 1983 or later, eliminating the storage artifact for all markers except T4. Analyses comparing benign ovarian tumor controls and cancer cases were not susceptible to this bias because samples from both groups were collected and stored over a similar time period.

In conclusion, the specificity of a model that included measurements of posttranslationally modified forms of transthyretin, apolipoprotein A1, and age of patient was as high as that of models that included CA125 and age, but sensitivity was lower. When used alone, these markers improved detection of controls with CA125 levels ≥35 units/mL but lost sensitivity to identify late-stage cases. In general, influences of sample storage conditions, subject characteristics, and other covariates on biomarker levels require further consideration in discovery and replication studies.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Petricoin EF, Ardekani AM, Hitt BA, et al. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 2002;359:572–7.
2. Zhang Z, Bast RC, Jr., Yu Y, et al. Three biomarkers identified from serum proteomic analysis for the detection of early stage ovarian cancer. Cancer Res 2004;64:5882–90.
3. Kozak KR, Amneus MW, Pusey SM, et al. Identification of biomarkers for ovarian cancer using strong anion-exchange ProteinChips: potential use in diagnosis and prognosis. Proc Natl Acad Sci U S A 2003;100:12343–8.
4. Mor G, Visintin I, Lai Y, et al. Serum protein markers for early detection of ovarian cancer. Proc Natl Acad Sci U S A 2005;102:7677–82.
5. Cannistra SA. Cancer of the ovary. N Engl J Med 2004;351:2519–29.
6. Coombes KR, Morris JS, Hu J, Edmonson SR, Baggerly KA. Serum proteomics profiling—a young technology begins to mature. Nat Biotechnol 2005;23:291–2.
7. Ransohoff DF. Lessons from controversy: ovarian cancer screening and serum proteomics. J Natl Cancer Inst 2005;97:315–9.
8. Diamandis EP. Proteomic patterns to identify ovarian cancer: 3 years on. Expert Rev Mol Diagn 2004;4:575–7.
9. Baggerly KA, Morris JS, Edmonson SR, Coombes KR. Signal in noise: evaluating reported reproducibility of serum proteomic tests for ovarian cancer. J Natl Cancer Inst 2005;97:307–9.
10. DiMagno EP, Corle D, O'Brien JF, Masnyk IJ, Go VLW, Aamodt R. Effect of long-term freezer storage, thawing, and refreezing on selected constituents of serum. Mayo Clin Proc 1989;64:1226–34.
11. Fung ET, Yip TT, Lomas L, et al. Classification of cancer types by measuring variants of host response proteins using SELDI serum assays. Int J Cancer 2005;115:783–9.
12. SAS Users Guide. Version 9. Cary (NC): SAS Inst; 1999.
13. Breiman L. Random forests. Machine Learning 2001;45:5–32.
14. Wu BL, Abbott T, Fishman D, et al. Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data. Bioinformatics 2003;19:1636–43.
15. Breiman L, Spector P. Submodel selection and evaluation in regression. The X-random case. Int Stat Rev 1992;60:291–319.
16. Dudoit S, van der Laan MJ. Asymptotics of cross-validated risk estimation in model selection and performance assessment. Technical Report 126, U.C. Berkeley Division of Biostatistics Working Paper Series. 2003; URL http://www.bepress.com/ucbbiostat/paper126.
17. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. International Joint Conference on Artificial Intelligence; 1995. Available from: http://www.starry.stanford.edu:pub/ronnyk/accEst-long.ps.
18. Bernstein LH, Ingenbleek Y. Transthyretin: its response to malnutrition and stress injury. Clinical usefulness and economic implications. Clin Chem Lab Med 2002;40:1344–8.
19. Jungner I, Marcovina SM, Walldius G, Holme I, Kolar W, Steiner E. Apolipoprotein B and A-1 values in 147,576 Swedish males and females, standardized according to the World Health Organization-International Federation of Clinical Chemistry First International Reference Materials. Clin Chem 1998;44:1641–9.
20. Bachorik PS, Lovejoy KL, Carroll MD, Johnson CL. Apolipoprotein B and A1 distributions in the United States, 1988-1991: results of the National Health and Nutrition Examination Survey III (NHANES III). Clin Chem 1997;43:2364–78.
22. Mahlck CG, Grankvist K. Plasma prealbumin in women with epithelial ovarian carcinoma. Gynecol Obstet Invest 1994;37:135–40.
23. Kuesel AC, Kroft T, Prefontaine M, Smith IC. Lipoprotein(a) and CA125 levels in the plasma of patients with benign and malignant ovarian disease. Int J Cancer 1992;52:341–4.
24. Ye B, Cramer DW, Skates SJ, et al. Haptoglobin-α subunit as potential serum biomarker in ovarian cancer: identification and characterization using proteomic profiling and mass spectrometry.
25. Hu J, Coombes KR, Morris JS, Baggerly KA. The importance of experimental design in proteomic mass spectrometry experiments: some cautionary tales. Brief Funct Genomic Proteomic 2005;3:322–31.