Purpose: To establish a comprehensive proteomic approach for biomarker discovery and validation in breast fluid.
Experimental Design: A total of 95 specimens from three institutions were used, including 10 nipple aspiration fluid specimens (5 from stage I/II cancerous breasts and 5 from age-matched healthy controls), 42 ductal lavage fluid specimens from 14 patients with unilateral stage I/II cancer (25 from 9 cancerous breasts and 17 from 7 contralateral breasts), and 42 ductal lavage fluid specimens from 14 high-risk women (multiple ducts, repeated lavage). Differentially expressed proteins/peptides were discovered by proteomic analysis of training samples, using ProteinChip arrays and surface-enhanced laser desorption ionization (SELDI) time-of-flight mass spectrometry, and validated on independently collected testing samples. After protein identification, ELISA was done to confirm the SELDI findings.
Results: We were able to obtain reproducible protein profiles using a minimal amount of protein (1 μg) by applying an optimized chip protocol and SELDI. We were able to select cancer-associated biomarkers despite large individual variability by applying both unsupervised and supervised cluster analysis. Furthermore, we were able to train and test candidate biomarkers on independently collected samples and identified one component of a multimarker panel as human neutrophil peptides 1 to 3.
Conclusions: Breast fluid is a rich source of breast cancer biomarkers. In combination with high-throughput novel proteomic profiling technology and multicenter study design, markers that are highly specific to breast cancer can be discovered and validated. Our observations also suggest that persistent elevation of human neutrophil peptide in high-risk women may imply early onset of cancer not yet detectable by current detection method. Proof of this hypothesis requires follow-up on a larger study population.
Breast cancer is the most commonly diagnosed cancer among women. Presymptomatic screening to detect early-stage breast cancer while it is still resectable could potentially reduce breast cancer-related mortality. Unfortunately, only 63% (1992-1999, United States) of the breast cancers are localized at the time of diagnosis (1). Small lesions are frequently missed and may not be visible even by mammography, particularly in young women and women with dense breast tissue (2). Molecular markers that can potentially identify these small lesions that are invisible to imaging techniques will provide a real opportunity to treat a neoplasm before it invades the tissue.
Breast cancer is highly heterogeneous. Most molecularly based approaches that have been investigated for the early detection of breast cancer are targeted at specific factors, such as oncogenes, tumor suppressor genes, growth factors, tumor antigens, or other gene products. The inherent problem is that none of these factors alone can account for a large majority of the breast cancers and some are not specific to cancer or breast tissues; thus, the sensitivity and specificity of such approaches is low. Thus far, no molecular biomarkers are recommended for the early detection of breast cancer (3).
The human mammary gland is composed of discrete ductal-alveolar systems that originate at the nipple and branch through the surrounding stroma toward the chest wall. Most breast carcinomas (70-80%) are thought to arise from the epithelial cells lining the terminal ducts of these structures. The breast epithelium exfoliates cells as a renewal of tissue and secretes fluids into the luminal compartment of the gland. These fluids exit each breast through six to nine separate orifices at the nipple and can be collected using either of the two noninvasive procedures: nipple aspiration and ductal lavage. In nipple aspiration, a simple handheld suction cup is placed on the nipple and used to quickly obtain concentrated fluid droplets at nipple openings. These droplets were collected with capillary tubes. This technique is successful in most women (4) and the yield typically varies from several microliters to 100 μL (5, 6). As nipple aspiration fluid (NAF) comes only from the immediate vicinity of the nipple and the yield of which is unpredictable, a ductal lavage system has been devised. This method involves suction of the nipple to localize NAF-yielding duct(s). NAF-producing duct(s) can then be cannulated using a microcatheter and lavaged with saline. Ductal lavage fluid (DLF) may provide a better source of cells and proteins released from the tumor because it represents washes from the entire length of the duct.
Compared with serum, breast fluids potentially offer a superior source of biomarkers for breast cancer because the proteins present are specifically released from breast tissue. Therefore, it would be beneficial to screen the breast fluid from a large patient cohort for a multiple protein panel that can identify majority of the breast cancer cases. However, due to the limitations on specimen resource and protein yield, proteomic study of this important body fluid remains limited. Furthermore, among the few studies reported to date, only differentially expressed protein spots (by two-dimensional gel analysis) or peaks (by mass spectrometry) have been reported (7–11). The strength of these entities remains weak due to the lack of validations and the lack of protein identifications. In this study, we have recruited specimens from multiple centers (NAF or DLF, n = 95) and established a comprehensive proteomic approach for biomarker screening, detection, and validation. The basic design of this approach is as follows: (a) proteomic profiling using protein chip arrays and mass spectrometry; (b) biomarker discovery using a combination of bioinformatics tools, unsupervised cluster analysis to recognize potential patients' subgroups, and supervised cluster analysis for biomarker selection within each subgroup; (c) validation of results using independent samples; (d) protein identification of the potential biomarkers; and (e) additional validation using a quantitative immunoassay.
Materials and Methods
Multicenter study design
To minimize the potential biases on patient selection, as well as fluid collection procedures at each institution, and to maximize the applicability of the discovery, we recruited both nipple aspiration and DLF specimens from three institutions, the University of Texas M.D. Anderson Cancer Center, the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins Hospital, and the Bluhm Family Program for Breast Cancer Early Detection and Prevention, Northwestern University. Specimen collection at each site was approved by the respective Institutional Review Board and the current proteomic study was approved by the Institutional Review Board of Johns Hopkins University.
Nipple aspiration fluid
Patients. NAF specimens were obtained from the University of Texas M.D. Anderson Cancer Center. Patients who presented with biopsy-proven stage I or II unilateral primary invasive breast carcinoma were eligible for bilateral nipple aspiration. Patients were excluded from participation if they had previously undergone subareolar surgery that might have disrupted the terminal ductal system. Individuals were also eligible to participate if they were >40 years of age and had no evidence of breast disease or breast cancer as evidenced by normal findings on physical examination and breast imaging. Ten fluid samples were available for this study: five were from the cancerous breast of breast cancer patients and the other five were from the breast of healthy donors.
Collection procedure. Ductal fluid was collected by nipple aspiration using a handheld suction cup similar to nonpowered breast pumps used to express milk from lactating women. This device consists of a plastic cup connected to a section of polymer tubing. The tubing is attached to a standard syringe that is used to create a gentle vacuum. This device was originally used and described by Sartorius et al. (4) and was purchased for the collection of NAF from CYTYC Health Corp. (Boxborough, MA).
Before aspiration was attempted, the nipple was cleansed with a small amount of Omniprep paste (D.O. Weaver & Co., Aurora, CO) to remove keratin plugs and then cleansed with an alcohol pad. A small amount of lotion was placed on the breast and the breast was gently massaged from the chest wall towards the nipple for 1 minute. The suction cup was then placed over the nipple and the plunger of the syringe was withdrawn to the 5- to 10-mL level until ductal fluid was visualized. The fluid droplets were collected into a 10-μL graduated micropipette (Drummond Scientific Co., Broomall, PA). Samples were obtained from both breasts and the presence of NAF and volumes of NAF obtained were recorded for each patient and each breast.
Immediately after collection, the NAF samples were rinsed into centrifuge tubes containing 500 μL of sterile PBS supplemented with protease inhibitors 4-[2-aminoethyl]-benzenesulfonylfluoride-HCl (0.2 mmol/L), leupeptin (50 μg/mL), aprotinin (2 μg/mL), and DTT (0.5 mmol/L). The samples were then centrifuged at 1,500 rpm for 10 minutes to remove insoluble materials and the supernatant was collected in 50-μL aliquots.
Ductal lavage fluids
Patients. Two institutions contributed DLF samples to this study.
In a Department of Defense–sponsored clinical trial at Johns Hopkins Hospital, women with stage I or II biopsy-proven unilateral primary breast cancer were eligible for breast fluid collection before surgery. DLF from one to three ducts were collected individually from the patient's cancer-bearing and contralateral “normal” breasts. At the time of the study, 42 specimens were available, 25 were from 9 breasts with cancer and 17 were from 7 contralateral breasts, and these specimens were from a total of 14 patients.
In a separate trial at Northwestern Memorial Hospital, women who were at increased risk because of a 5-year Gail risk estimate >1.6% or history of lobular carcinoma in situ were recruited for tamoxifen treatment. These women underwent ductal lavage at entry, made a decision for or against tamoxifen use for breast cancer prevention, and underwent repeat ductal lavage 6 to 12 months later. Forty-two DLF specimens from 14 high-risk women who chose not to take tamoxifen were available for this study.
Collection procedure. Nipple aspiration was first done to identify fluid-yielding duct. If nipple fluid was seen, the fluid-yielding duct was cannulated with a single lumen microcatheter and lavaged with normal saline. Local anesthesia was provided before cannulation with periareolar infiltration with ∼5 mL of 1% lidocaine and the ductal tree was anesthetized by instilling 2 to 3 mL of 1% lidocaine through the nipple duct sphincter as soon as the tip of the catheter was introduced into the duct orifice. Intermittent breast massage was done after the instillation with ∼2 mL of saline and this process was repeated four to five times so that the total instilled volume was ∼10 to 20 mL. The location of the fluid-yielding and cannulated ducts was recorded on an 8 × 8 grid and photographed after inserting a prolene suture to facilitate recannulation at a later date in the Northwestern trial.
Immediately after collection, the samples were centrifuged at 1,500 rpm for 10 minutes to remove cells and insoluble materials and the supernatant was collected for subsequent proteomic analysis.
Specimen preparation for proteomic analysis
All specimens were stored at −80°C after collection and frozen aliquots were shipped to Johns Hopkins on dry ice. No further processing was required for the NAF specimens received whereas DLF was first lyophilized and then dialyzed overnight against PBS to remove excess saline using Tube-O-Dialyzer with 1 kDa molecular weight cutoff (Upstate, Charlottesville, VA). Protein concentration in each fluid was measured using the bicinchoninic acid protein assay kit (Pierce, Rockford, IL).
Surface-enhanced laser desorption ionization mass spectrometry analysis
Various proteomic chip chemistries (hydrophobic, anionic, cationic, and metal affinity) were initially evaluated to determine which affinity chemistry provided the best profiles in terms of number and resolution of proteins. The Immobilized Metal Affinity Capture chip arrays (IMAC30) were selected. The active spots on IMAC30 contain nitrilotriacetic acid groups that chelate metal ions. Proteins bind to the chelated metal on IMAC30 arrays through histidine, tryptophan, cysteine, or phosphorylated amino acids. The minimal amount of fluid protein required for each analysis and the binding and washing conditions were also tested for optimal protein presentation. Briefly, IMAC30 chip arrays were pretreated with CuSO4 using a 96-well format bioprocessor (which holds 12 eight-spot chips) following the instructions of the manufacturer (Ciphergen Biosystems, Fremont, CA). After incorporation of Cu2+ onto the chip surface, the bioprocessor was disassembled to release the chips. Various volumes (1-15 μL) of breast fluid samples containing 1 μg of protein were applied directly onto the pretreated spots and allowed to air-dry at room temperature. Allocation of specimens on protein chip arrays was randomized, including the triplicates of the same sample. The bioprocessor was reassembled after sample application and each spot was washed twice with 100 μL of PBS for 5 minutes followed by two quick rinses with 100 μL of dH2O to remove loosely bound materials. After air-drying, 0.5 μL of saturated sinapinic acid prepared in 50% acetonitrile, 0.5% trifluoroacetic acid was applied twice to each spot as the energy-absorbing molecule. Proteins bound to the chip surfaces were detected using a PBS-II ProteinChip Reader (Ciphergen Biosystems). An automated analytic protocol was used to control the data acquisition process. Each spectrum was an average of 80 laser shots and externally calibrated against a mixture of known peptides or proteins. Molecular weight determination error was 0.05%.
The data analysis process used in this study involved the following steps: (a) Peak detection. ProteinChip Software 3.0 (Ciphergen Biosystems) was used to collect and evaluate the raw spectra. All mass spectra were compiled and baseline was subtracted. Qualified mass peaks (visual examination) with signal/noise > 5 were manually selected and the peak intensities were normalized to the total ion current of the selected mass region. In this case, it was between 3 and 135 kDa. The peak intensities at each mass/charge (m/z) identified in triplicate analysis were averaged and then log transformed for subsequent analysis. (b) Biomarker selection using training data. We have used the peak intensity data obtained from 10 NAF specimens as the training data for selection of candidate biomarkers. Due to the individual variability of the mass spectra obtained (visual inspection), an unsupervised cluster analysis (MATLAB) was first done to recognize any potential patient subclasses based on general protein expression patterns. Two clusters were observed: one consists of specimens C6, C14, N32, N33, and N36 (group A) and the other consists of specimens C11, C16, C26, N4, and N15 (group B). A supervised cluster analysis was then done within each subgroup using ProPeak (3Z Informatics, Charleston, SC) and biomarkers that can effectively separate the cancer and noncancer data were selected. ProPeak implements the linear version of the Unified Maximum Separability Analysis algorithm that was first reported for use in microarray data analysis (7). Application of ProPeak in surface-enhanced laser desorption ionization (SELDI) protein array data analysis was described in detail previously (8). Briefly, each specimen was analyzed and projected as an individual point onto a three-dimensional component space where location of each point was determined by linear regression derived composite index using peak intensity data. 
The rank of each peak represents its contribution towards the maximal separation of the cancer and noncancer specimens. In this case, we visually inspected the peaks with high discriminatory power and selected five peaks (three peaks in group A and two peaks in group B) that are elevated in cancer for further evaluation. (c) Biomarker validation using independent testing data. The validity of the potential biomarkers was tested on DLF specimens collected at the Johns Hopkins Hospital. Of the 42 1-mL DLF aliquots available for this study, 24 yielded more than 3 μg of protein needed for subsequent SELDI analysis. Equal amounts of protein from 11 DLF from cancerous breasts and 13 DLF from 7 noncancerous breasts, respectively, were pooled to represent the cancer and noncancer breasts. Protein profiles of the pooled specimens were generated in an independent experiment using the same chip protocol as described for the analysis of NAF.
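The preprocessing in step (a) above (normalization to total ion current, averaging of triplicates, log transformation) can be sketched as follows. This is a minimal illustration and not the ProteinChip Software procedure: the function names, the choice of natural logarithm, and the toy intensities are assumptions made here for illustration.

```python
import math

def normalize_to_tic(peak_intensities, total_ion_current):
    """Divide each selected peak intensity by the spectrum's total ion current."""
    return {mz: value / total_ion_current for mz, value in peak_intensities.items()}

def average_and_log(replicates):
    """Average normalized intensities across replicate spectra, then log-transform.

    Assumes every replicate reports the same m/z keys and that all averaged
    intensities are positive (the logarithm is undefined otherwise).
    """
    profile = {}
    for mz in replicates[0]:
        mean_intensity = sum(rep[mz] for rep in replicates) / len(replicates)
        profile[mz] = math.log(mean_intensity)  # natural log is an assumption
    return profile

# Hypothetical triplicate: (selected peak intensities, total ion current)
raw_triplicate = [
    ({3375: 10.0, 4079: 5.0}, 100.0),
    ({3375: 30.0, 4079: 10.0}, 200.0),
    ({3375: 10.0, 4079: 2.5}, 50.0),
]
normalized = [normalize_to_tic(peaks, tic) for peaks, tic in raw_triplicate]
profile = average_and_log(normalized)
```

Normalizing before averaging keeps a replicate with a stronger overall signal from dominating the mean.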
Surface-enhanced laser desorption ionization time-of-flight mass spectrometry immunocapture of BF1 to 3
Immunocapture was done using affinity-purified rabbit antibody against a 16-amino-acid peptide common to human neutrophil peptides (HNP) 1 to 3 (Alpha Diagnostics, San Antonio, TX). The antibody was linked to AminoLink beads using the AminoLink Plus Immobilization Kit (Pierce) following the instructions of the manufacturer. Two NAF specimens with high BF1 to 3 (C6 and C14) were used in the capture experiment and the captured peptides were analyzed on IMAC-Cu protein chip arrays as previously described.
Quantitative measurement of human neutrophil peptides 1 to 3 by ELISA
Level of HNP1 to 3 was measured using a sandwiched solid-phase ELISA. The kit was a product of HyCult Biotechnology (Uden, the Netherlands; distributed by Cell Sciences, Canton, MA). Each sample was diluted (a pre-experiment was done to determine the proper dilution factor for each sample) and measured in duplicates.
Results
Proteomic profiling of breast fluids. Using our optimized chip protocol, we were able to obtain reproducible protein profiles using breast fluid samples containing 1 μg of total protein. Figure 1 shows a pseudo-gel view of protein profiles of NAF from the cancerous breasts of five patients with primary invasive cancer (C6, C11, C14, C16, and C26) and from breasts of five normal controls (N4, N15, N32, N33, and N36). General protein expression profiles of different individuals are variable whereas mass spectra of triplicate analysis of the same specimen are highly reproducible. Ion signals of m/z < 3,000 are mainly noise of the matrix material and the largest m/z detected was at 135,000. We have manually selected 73 protein peaks (signal/noise > 5; m/z 3,000-135,000) for subsequent biomarker evaluation.
Biomarker selection using training data. The 10 NAF specimens were used as the training sample. First, we did unsupervised cluster analysis to recognize any potential patient subclasses based on their protein expression data. Two clusters were formed: cluster A consisted of specimens C6, C14, N32, N33, and N36; cluster B consisted of specimens C11, C16, C26, N4, and N15 (Fig. 2).
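The text does not name the clustering algorithm used in MATLAB. As a hedged illustration of how an unsupervised analysis can split specimens into two groups from their peak profiles, the sketch below assumes agglomerative hierarchical clustering with Euclidean distance and average linkage. The specimen labels and intensities are hypothetical and are not the study data.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length intensity profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)

def agglomerate(profiles, n_clusters=2):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], profiles),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return [sorted(c) for c in clusters]

# Hypothetical log-intensity profiles (two peaks per specimen)
toy_profiles = {
    "S1": (0.0, 0.1),
    "S2": (0.2, 0.0),
    "S3": (5.0, 5.1),
    "S4": (5.2, 4.9),
}
groups = agglomerate(toy_profiles, n_clusters=2)
```

In this toy example the two groups follow the overall expression pattern and ignore any cancer or noncancer label, which is the behavior that motivates running a supervised analysis within each cluster afterwards.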
To select biomarkers that are discriminatory within each subgroup, a subsequent supervised analysis was done using ProPeak. The peaks were ranked based on their contribution towards the maximal separation of the cancer and noncancer specimens within each group, and we selected five peaks (three from group A and two from group B) with the highest discriminatory power for further evaluation. The m/z values of the five selected peaks are 3,375 (BF1), 3,447 (BF2), 3,490 (BF3), 4,079 (BF4), and 4,680 (BF5; Fig. 2, arrows). BF1 to 3 appear as a cluster of three peaks and were elevated in C6 and C14, selected as the most effective discriminators in group A. BF4 was elevated in C11 and C16 and BF5 was elevated in C26. These two markers collectively can discriminate cancer versus noncancer specimens in group B. Collectively, a minimum of three peaks (BF1/2/3, BF4, and BF5) is needed to classify all five cancer cases correctly.
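ProPeak's Unified Maximum Separability Analysis is not reproduced here. As a much simpler stand-in that only conveys the idea of ranking peaks by how well they separate cancer from noncancer specimens within one subgroup, the sketch below scores each peak by the difference in mean log intensity and keeps only peaks elevated in cancer. The function name, peak labels, and values are hypothetical.

```python
from statistics import mean

def rank_peaks_elevated_in_cancer(cancer, noncancer):
    """Rank peaks by mean(cancer) - mean(noncancer), highest first.

    cancer and noncancer are lists of dicts mapping a peak label to its
    log intensity. Peaks that are not elevated in cancer are dropped.
    This is a simple stand-in and not the UMSA algorithm.
    """
    scores = {}
    for peak in cancer[0]:
        diff = mean(s[peak] for s in cancer) - mean(s[peak] for s in noncancer)
        if diff > 0:
            scores[peak] = diff
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical log intensities for one subgroup
cancer_specimens = [
    {"peak_a": 3.0, "peak_b": 1.0, "peak_c": 2.0},
    {"peak_a": 3.4, "peak_b": 1.2, "peak_c": 1.6},
]
noncancer_specimens = [
    {"peak_a": 1.0, "peak_b": 1.5, "peak_c": 1.2},
    {"peak_a": 1.2, "peak_b": 1.3, "peak_c": 1.4},
]
ranking = rank_peaks_elevated_in_cancer(cancer_specimens, noncancer_specimens)
```

With subgroups as small as those described here, any such ranking is exploratory, which is why the selected peaks were subsequently tested on independent samples.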
Biomarker validation using independent testing data. The validities of BF1 to 5 were tested on a pair of pooled DLF specimens originated from samples collected at Johns Hopkins Hospital. Of the 42 1-mL DLF aliquots available, 24 yielded more than 3 μg of protein needed for SELDI analysis. These 24 samples included 11 DLF from 9 breasts with cancer and 13 DLF from 7 contralateral cancer-free breasts. Because only one duct of each cancerous breast harbors the tumor, and not all ducts were sampled, fluid-yielding ducts from the cancerous breast may not necessarily include the duct with the tumor. Initially, we planned to use cytology as the gold standard for identification of cancer ducts but failed to do so due to lack of cells (<10 cells) in the majority of our samples (33 of 42). To increase the probability of getting true positive and true negative specimens representative of the cancer and noncancer ducts, we created a pair of pooled DLF specimens by pooling equal amounts of protein from all 11 ducts of the 9 cancerous breasts (DLF-C) and the 13 ducts of the 7 noncancerous breasts (DLF-N), respectively. As shown in Fig. 2C, the general protein expression pattern of the pooled DLF samples resembles NAF of group A and elevations of BF1 to 3 and BF5 in cancer in comparison with the noncancer controls were confirmed. It should be noted that elevations of BF1 to 3 and BF5 were previously observed in either group A or B; the pooled DLF samples therefore present features of both subgroups. The peak corresponding to BF4 was absent in the pooled DLF specimens; the validity of this marker therefore remains unverified.
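Pooling "equal amounts of protein" means that each duct contributes a volume inversely proportional to its protein concentration. A minimal sketch of that calculation is given below; the function name, the per-sample protein amount, and the concentrations are hypothetical and are not values reported in the study.

```python
def pooling_volumes(conc_ug_per_ul, protein_each_ug):
    """Volume (in microliters) to take from each sample so that every sample
    contributes the same protein mass to the pool."""
    volumes = {}
    for sample, conc in conc_ug_per_ul.items():
        if conc <= 0:
            raise ValueError(f"{sample}: concentration must be positive")
        volumes[sample] = protein_each_ug / conc
    return volumes

# Hypothetical protein concentrations in micrograms per microliter
toy_concentrations = {"duct_1": 0.5, "duct_2": 0.25}
volumes = pooling_volumes(toy_concentrations, protein_each_ug=1.0)

# Total protein in the pool: one equal share per contributing sample
pooled_protein_ug = sum(toy_concentrations[s] * v for s, v in volumes.items())
```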
BF1 to 3 were confirmed to be human neutrophil peptides 1 to 3. By searching through protein databases (National Center for Biotechnology Information: http://www.ncbi.nlm.nih.gov; Swiss-Prot: http://www.ebi.ac.uk/swissport), we found that the molecular weights of BF1 to 3, namely 3,375 (BF1), 3,447 (BF2), and 3,490 (BF3), correspond to the molecular masses of human neutrophil peptides 1 to 3 (HNP1-3). HNP1 to 3 are peptide antibiotics made principally by human neutrophils although some tumors might also produce HNP1 to 3 with the same capabilities. Besides their diverse functional activities in innate antimicrobial immunity, HNP1 to 3 have also been implicated by recent studies in tumor cell proliferation (see Discussion and references therein).
The identity of BF1 to 3 as HNP1 to 3 was verified by SELDI time-of-flight mass spectrometry immunocapture assay using a monoclonal antibody against HNP1 to 3. The antibody was amino-linked to beads and incubated with two NAF specimens with high BF1 to 3 peaks (NAF-C6 and NAF-C14). The original mass spectra of NAF-C6 and NAF-C14, along with spectra of the captured proteins from each of the two samples, are shown in Fig. 3. Three peptides were captured by the antibody and showed exactly the same molecular weights and expression pattern as BF1 to 3 (note the relative intensity of BF3 in relation to BF1 and 2 in NAF-C14).
Level of human neutrophil peptides 1 to 3 measured by quantitative immunoassay validated the surface-enhanced laser desorption ionization findings. Elevation of HNP1 to 3 in NAF-C6 and NAF-C14 was further confirmed by quantitative analysis of HNP1 to 3 by ELISA. High peak amplitude of BF1 to 3 correlated with high levels of HNP1 to 3 measured by ELISA (Fig. 4). The concentration of HNP was 11,905 ng/mL in C6 and 8,816 ng/mL in C14, >50-fold higher than the mean value of 172 ng/mL (range, 19-643 ng/mL) in the normal controls.
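The ">50-fold" statement follows directly from the concentrations quoted above. A short check, using only the values reported in the text (the variable and function names are chosen here for illustration):

```python
# ELISA concentrations reported in the text, in ng/mL
hnp_ng_per_ml = {"C6": 11905, "C14": 8816}
control_mean_ng_per_ml = 172

def fold_over_controls(value, control_mean):
    """Ratio of a specimen's concentration to the mean of the normal controls."""
    return value / control_mean

fold = {
    name: fold_over_controls(value, control_mean_ng_per_ml)
    for name, value in hnp_ng_per_ml.items()
}
```

Both specimens come out at roughly 69-fold (C6) and 51-fold (C14) above the control mean.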
To investigate whether the elevated HNP level in breast fluid was due to contamination with blood, HNP1 to 3 were also measured in a commercial pooled standard serum sample, as well as 20 banked serum samples of 4 apparently healthy women, 4 women with benign breast disease, 4 with ductal carcinoma in situ, and 4 with invasive breast cancer. The level of HNP1 to 3 in the pooled commercial serum was 41 ng/mL (value plotted in Fig. 4). HNP in banked sera ranged from 11 to 456 ng/mL and the mean was 44 ng/mL. The relatively low concentration of HNP in serum, irrespective of cancer/noncancer status, suggested that the HNP in these fluid samples could not be due to contamination with blood.
Level of human neutrophil peptides 1 to 3 in ductal lavage fluid from women at high risk of breast cancer. Specificity of HNP1 to 3 to breast cancer was further tested by ELISA in 42 DLF specimens from 14 women at high risk of breast cancer (repeat lavage of both breasts at 6- to 12-month intervals, collected at Northwestern University). Elevation of HNP1 to 3 was only observed in DLF of one woman (patient 11) whereas all 36 samples from the other 13 women tested negative (Fig. 5A). Patient 11 was enrolled in the study due to a family history of breast cancer and a 5-year Gail risk estimate of 1.7%. A total of six fluid samples from two ducts of her left breast and one duct of her right breast were collected at two time points 8 months apart. High level of HNP1 to 3 was observed in all three ducts at the first time point and in two ducts at the second time point (Fig. 5A). The cytologic findings from all ducts were benign and no cancer has been detected on radiologic surveillance with 18 months of follow-up. This woman, as well as other study participants, will continue to be followed.
To exclude the possible effect of protein yield on measurement of HNP1 to 3, we have also plotted the corresponding protein concentration in each sample in Fig. 5B. No correlation was observed between the level of HNP1 to 3 and the protein yield; low expression of HNP is not due to lack of protein in sample.
Discussion
In this study, we have employed SELDI as the proteomic platform for differential profiling of breast fluid samples. Compared with our experience with conventional two-dimensional gel electrophoresis, SELDI is preferred in this application because it requires much less sample and offers higher throughput. In one study reported by Kuerer et al. (9), 80 μg of nipple aspirate fluid proteins were used for each two-dimensional gel analysis, followed by SyproRuby fluorescent staining, whereas 3.6 to 4 μg of proteins were used in two previous SELDI studies (10, 11). Given the limited protein yield of most of our breast fluid specimens (typically between a few micrograms and a few hundred micrograms), a minimal sample requirement is pivotal. Using our optimized chip protocol, we were able to obtain reproducible protein profiles using breast fluid samples containing 1 μg of total protein. This minimal sample requirement allowed us to obtain mass spectra from most of our specimens and to analyze each sample in triplicate.
Three other studies to date have reported on the use of SELDI in the analysis of proteomic expression patterns in ductal fluids. Paweletz et al. (12) have reported the analysis of NAF on hydrophobic chips from 12 women with breast cancer and 15 healthy controls. Fifty protein peaks were resolved and two proteins were found uniquely present in tumor-associated samples (at 4,233 and 9,470 Da) and two were found uniquely associated with the normal samples (at 3,416 and 4,150 Da). The authors observed unusually large variations in the spectra between different NAF samples within a group. This may be related to the biological variability of the breast duct microenvironment of different individuals (as we have also observed) but may also be related to other factors such as collection bias (samples were from three sources and were used as one cohort) and differences in the protein concentration of the samples analyzed (equal fluid volume instead of equal protein was analyzed).
Another study, reported by Sauter et al. (10), compared protein profiles of NAF samples from 20 subjects with breast cancer and 13 with nondiseased breasts. Equal amounts of total protein (3.6 μg) were analyzed on three different chip surfaces (normal phase, anion exchange, and hydrophobic). This study identified five differentially expressed proteins. The most sensitive and specific proteins were 6,500 and 15,940 Da, found in 75% to 84% of samples from women with cancer but only in 0% to 9% of samples from normal women.
The third study, reported by Pawlik et al. (11), has compared protein expression profiles of paired NAF samples from 23 patients with unilateral breast cancer, as well as NAF, from five unrelated healthy controls. Individual patterns of proteins secreted by breast ductal cells most likely vary from one woman to another, probably as a function of an individual's specific hormonal milieu. For this reason, comparison between the breasts of the same individual is an attractive approach as she acts as her own internal control for these hormonal stimuli. Based on paired t test on 463 distinct peaks detected on two chip surfaces (cation exchange and metal affinity), the author reported that no significant differences were found in protein expression between the paired samples of the same cancer patient whereas comparison between pooled spectra of breasts of healthy controls and the cancer patients revealed 20 differently expressed peaks (non-tumor-bearing breasts versus healthy controls, 3 peaks; tumor-bearing breasts versus healthy controls, 17 peaks).
It is difficult at the present time to compare results among different studies. Differences on surface chemistry, binding/washing conditions, and instrument setting may all contribute to the binding and detection of different protein species and therefore lead to the discovery of different subset of biomarkers. In addition, different data processing algorithms, peak intensity thresholds, and statistical analysis may all result in different peak profiles and consequently affect the result of marker selection. Although diversity on study protocols would facilitate the detection of a broader spectrum of proteins and therefore generate more candidate biomarkers, the strength of aforementioned peak entities remains weak due to the lack of validations.
To establish a filtering mechanism that could provide some preliminary evaluation of the reliability of the candidate biomarkers, we recruited fluid specimens from multiple institutions. We trained our biomarker panel using NAF specimens collected by the M.D. Anderson Cancer Center and tested it on DLF specimens collected by Johns Hopkins Hospital. Not only were the training and testing samples collected independently, but the proteomic analyses were also done separately in different experiments on different days. For these reasons, we have more confidence in the validity of BF1 to 3 and BF5, as both markers were consistently elevated in cancer in both data sets.
To gain insight into the biology of the potential biomarkers, knowing the identity of these proteins is essential. The common approach for protein identification used in our laboratory involves chromatographic protein separation followed by gel electrophoresis, in-gel trypsin digestion, and tandem mass peptide sequencing. Using such an approach, we have in the past successfully identified several biomarkers found in patient serum of ovarian and breast cancer (13).⁵
⁵ J. Li, R. Orlandi, C.N. White, J. Rosenzweig, J. Zhao, E. Seregni, D. Morelli, Y. Yu, X.Y. Meng, Z. Zhang, N.E. Davidson, E.T. Fung, and D.W. Chai, in press.
HNP1 to 3 are members of the α-defensins and are major constituents of the dense azurophilic granules of neutrophils. Besides their diverse functional activities in innate antimicrobial immunity, HNP expression has also been linked to different types of tumors and cell lines. HNP1 peptide has been detected in tissue samples of lung tumors (14) and in the submandibular glands of patients with oral carcinomas (15) whereas expression of HNP1 to 3 has been detected in tissues of renal cell carcinomas (16) and in tissues of colon tumors (17). HNP1 to 3 expression in tumors primarily originates from tumor-invading eosinophils (14) and neutrophils (15, 18) although some tumors might also produce HNP1 to 3 with the same capabilities. By reverse transcription-PCR, mass spectrometry, and flow cytometric analysis, HNP1 to 3 have been shown to be expressed by cell lines derived from renal cell carcinomas (16) and the expression of a specific HNP precursor peptide has been shown to be up-regulated in human leukemic cells (19). Furthermore, it has also been shown that the excess amounts of HNP1 to 3 observed in urine from bladder cancer patients were often produced by the actual bladder cancer cells (20) and that highly invasive bladder cancer cells produced more HNP1 to 3 than did the less invasive ones (as cited in ref. 17). HNP1 to 3 are also known to stimulate bronchial epithelial cells to up-regulate interleukin-8 production (21), a potent neutrophil chemotactic factor. Thus, the up-regulated expression of HNP1 to 3 in tumors may primarily originate from infiltrating neutrophils but could be initiated by HNP1 to 3–producing cancer cells.
Irrespective of the source of expression, the main question is the exact role of HNP1 to 3 in the tumor microenvironment in vivo. As shown in renal cell carcinoma–derived cell lines, HNP1 to 3 stimulated DNA synthesis at 6 to 25 μg/mL and suppressed DNA synthesis at higher concentrations (>25 μg/mL). Similarly, HNP1 at 10⁻⁴ mol/L (equivalent to 344 × 10³ ng/mL) is cytotoxic for human monocytes whereas lower concentrations of HNP1 (10⁻⁹ to 10⁻⁸ mol/L, equivalent to about 3 to 30 ng/mL) increase tumor necrosis factor-α production by monocytes (22). Although the exact concentration cutoff for the functional switch of HNP may vary in different cancers, these in vitro experiments suggest that the function of HNP in relation to tumor growth may be concentration dependent.
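The molar and mass concentrations quoted for HNP1 are related through its molecular mass. A minimal sketch of that conversion is given below, assuming a mass of roughly 3,442 Da for HNP1; this value is an assumption made here for illustration and lies close to the m/z of the BF peaks.

```python
HNP1_MASS_DA = 3442.0  # assumed molecular mass of HNP1 in g/mol

def molar_to_ng_per_ml(conc_mol_per_l, mass_da=HNP1_MASS_DA):
    """Convert a molar concentration to ng/mL for a peptide of the given mass."""
    grams_per_liter = conc_mol_per_l * mass_da
    return grams_per_liter * 1e6  # 1 g/L equals 1e6 ng/mL

cytotoxic_ng_per_ml = molar_to_ng_per_ml(1e-4)
stimulatory_range_ng_per_ml = (molar_to_ng_per_ml(1e-9), molar_to_ng_per_ml(1e-8))
```

Under this assumption, 10⁻⁴ mol/L corresponds to about 344 μg/mL (344 × 10³ ng/mL), and 10⁻⁹ to 10⁻⁸ mol/L corresponds to about 3 to 34 ng/mL.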
In summary, we have in this study established a comprehensive proteomic approach that is suitable for biomarker discovery and validation in breast fluid. We obtained reproducible protein profiles using minimal sample by SELDI and we addressed the issue of individual variability by the application of various informatics tools. Using a multicenter study design, we trained and validated biomarkers of interest on independently collected samples and identified BF1 to 3 as HNP1 to 3. The presence of HNP1 to 3, as well as its association to breast cancer, has not been previously reported. The function of HNP1 to 3 in the tumor microenvironment may be complicated as they can both serve to promote or inhibit tumor growth in a concentration-dependent manner in vitro. In a continuing collaboration with the Northwestern group, we are currently following up a large number of high-risk women who undergo repeated lavage every 6 to 12 months. Level of HNP will be measured and its predictive value for early detection of breast cancer will be investigated.
Grant support: The Breast Cancer Specialized Program of Research Excellence, the Susan Love M.D. Breast Cancer Foundation, the Bluhm Family Program for Breast Cancer Early Detection and Prevention, NIH/National Cancer Institute P50 CA89018-02, and Ciphergen Biosystems.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank Dr. Susan Love and the Susan Love Breast Cancer Foundation for their support to the development of intraductal approach in breast cancer research.