Abstract
Purpose: The anti-malignin antibody serum (AMAS) test (Oncolab, Boston, MA) has been reported as 97% sensitive and 95% specific for malignancies. To objectively assess accuracy of this test for discrimination of breast cancer, we studied a series of women undergoing core breast biopsy.
Subjects and Methods: Seventy-one core-needle breast biopsies were classified as malignant, suspicious, or benign by two independent pathologists blinded to AMAS results. Corresponding sera were read as AMAS positive, negative, or borderline by criteria used by Oncolab and also using criteria derived from receiver-operator curves based on values for slow (S-tag), fast (F-tag), and their difference (Net-tag) antibody reported by Oncolab. We calculated sensitivity and specificity and analyzed distributions by Fisher's exact test.
Results: Biopsies were read as 42 (59%) benign, 12 (17%) suspicious, and 17 (24%) malignant. By Oncolab criteria, sensitivity (59%) and specificity (62%) were maximized by pooling suspicious with malignant and AMAS borderline with positive (P = 0.098). Receiver-operator curves showed best sensitivity (62%) and specificity (69%) for the criterion AMAS positive if Net-Tag > 135 μg/mL or S-Tag > 220 μg/mL (P = 0.015).
Conclusions: The AMAS test discriminates suspicious and malignant from benign lesions, but sensitivity is insufficient to identify patients to be spared biopsy and false-positive rates are too high for population screening.
Introduction
Cancer screening tests are done with the goal of early diagnosis, but whether early treatment affects long-term cancer outcomes remains controversial (1, 2). Approaches to screening include measurement of circulating biomolecules released by cancers, ectopically (e.g., carcinoembryonic antigen) or eutopically, but in excess (e.g., prostate-specific antigen); detection of DNA sequences or hypermethylation characteristic of specific oncogenes in shed cells (3-5); and detection of antibodies against tumor antigens (6).
Malignin has been described as an 89-amino-acid, 10-kDa peptide, rich in glutamic and aspartic acid. It is obtained from acid extraction of a glioblastoma cell line (7). The molecule has been identified as a moiety of a larger (250 kDa) brain membrane glycoprotein, which seems hypoglycosylated in glioblastoma cells (8). Elevated titers of anti-malignin-binding antibodies have been reported in serum from patients with a variety of malignancies (9) and the anti-malignin antibody serum (AMAS) test has been reported to provide specificity and sensitivity of >95% for cancers, regardless of type (10-12). A recent study (13) found 97% sensitivity and 64% specificity in women with suspicious mammograms and/or palpable breast masses coming for a needle biopsy.
A noninvasive test that could identify women with suspicious breast findings whose probability of breast cancer is low enough to spare them the expense and discomfort of a biopsy procedure would be clinically useful. If the AMAS test reliably identifies 97% of patients with breast cancer (13), it could conceivably be employed for this purpose. The sole laboratory performing the AMAS test is Oncolab (Boston, MA). The test has been highly controversial (14). To evaluate the AMAS test for discrimination of cancer in high-risk patients, we conducted a blinded trial in the relevant patient population, women about to undergo an image-guided core-needle breast biopsy.
Materials and Methods
Research Subjects
Subjects were recruited by collaborating radiologist investigators (P.G., B. B-W.) from patients referred for needle biopsy of breast lesions after a suspicious finding on mammography or manual examination. Patients were excluded only for a prior history of malignancy. Of patients invited, ∼10% participated.
Eighty patients were entered into the study at two centers, at Tucson (n = 45) and Scottsdale (n = 35), AZ between April 24, 2001 and October 7, 2002. The protocol was approved by the institutional review boards of the University of Arizona and the Arizona State University. All subjects were informed of the potential risks of the study and signed informed consent documents. Complete data were obtained on 71 patients.
Biopsies
A six-core image-guided 14-gauge needle biopsy was done on each patient. Samples obtained were transferred to 10% buffered formaldehyde and sent to laboratories for processing.
Histopathology
Biopsy samples were processed using standard histopathologic methodology and examined by local histopathologists. In addition, two research pathologists (L.W.R. and J.A.I.), blinded to local pathology readings and AMAS results, evaluated duplicate slides and agreed on a diagnosis. Local and investigator pathologists' readings were compared at the coordinating center. Where disagreement occurred, a third set of slides was sent to the Armed Forces Institute of Pathology for a “tie-breaking” reading. Final readings were classified as malignant, benign, or suspicious and reviewed by the study pathologists who confirmed all but one, which was changed from benign to suspicious to produce the final diagnostic data set.
AMAS Testing
Refrigerated blood samples drawn by venipuncture into BD#6440 vacutainer tubes were allowed to clot at room temperature for 30 to 120 minutes before centrifugation at 1,500 × g for 10 minutes. Sera were transferred to a 3-mL NUNC tissue culture tube with sealing cap and frozen. Each sample was forwarded immediately in an insulated package with dry ice by overnight mail to Oncolab. Samples were identified only by number.
Sera were assayed for anti-malignin antibodies as has been described (15). Briefly, 0.2 mL of patient serum was added to 0.2 mL of washed TARGET reagent (malignin antigen, covalently bound to bromoacetyl cellulose; Brain Research, Inc., New York, NY). After replicate incubations at 4°C with constant shaking for 10 and 120 minutes, reagent is precipitated by centrifugation and triple washed with 0.2 mL ice-cold 0.15 mol/L NaCl for 3 seconds. Bound antibody is eluted by shaking for 3 seconds on a vortex mixer with 0.4 mL of 0.25 mol/L acetic acid, the reagent precipitated by centrifugation, and supernatants read on a spectrophotometer for protein concentration at 280 nm using a known quantity of monoclonal anti-malignin antibody as a standard. Duplicates of elevated and normal control sera are run with each batch of 20 unknown sera. Oncolab has reported coefficients of variation for total anti-malignin antigen antibody complex of ±6% between technologists and ±11% across multiple days and assays (16). Protein concentration in μg/mL after 10 minutes of incubation is labeled fast target absorbed globulin (F-Tag) and that after 120 minutes is slow target absorbed globulin (S-Tag). Specific binding (Net-Tag) is calculated by subtracting F-Tag from S-Tag. At Oncolab, readings are considered borderline for Net-Tag > 100 but <135 and elevated if Net-Tag > 135, S-Tag > 400, or F-Tag > 300 (10).
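The reading rule quoted above can be written out as a short sketch. The function names are illustrative, and the handling of a Net-Tag of exactly 135 is an assumption, since the stated limits ("<135" and "> 135") leave that boundary unspecified.

```python
def net_tag(s_tag: float, f_tag: float) -> float:
    """Specific binding in ug/mL: S-Tag (120 min) minus F-Tag (10 min)."""
    return s_tag - f_tag


def oncolab_reading(s_tag: float, f_tag: float) -> str:
    """Return 'elevated', 'borderline', or 'normal' using the limits quoted in the text."""
    net = net_tag(s_tag, f_tag)
    if net > 135 or s_tag > 400 or f_tag > 300:
        return "elevated"
    # Assumption: a Net-Tag of exactly 135 is treated as borderline here;
    # the text does not say how this boundary value is read.
    if 100 < net <= 135:
        return "borderline"
    return "normal"


# Hypothetical values in ug/mL, for illustration only (not study data).
for s, f in [(250, 100), (200, 90), (150, 80), (420, 310)]:
    print(s, f, net_tag(s, f), oncolab_reading(s, f))
```

Note that the last hypothetical sample is read as elevated on its S-Tag and F-Tag values alone, even though its Net-Tag falls in the borderline range.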
Statistical Methods
To evaluate the numbers of subjects required to detect significant nonrandom distribution of subjects into true and false-positive and true and false-negative categories, we assumed that a rate of malignancy by biopsy diagnosis could range from 20% to 40% based on the experience of the investigators and on prior published studies (17, 18). Using χ2 analysis on model groupings, we determined that, to detect statistical significance at the 0.05 level, given 65% levels of sensitivity and specificity, at least 70 patients would be required if only 20% had malignant diagnoses by biopsy. Higher levels of sensitivity or specificity or greater numbers of patients with malignancy resulted in more significant P values.
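The sample-size reasoning above can be approximated with a short calculation. This is a minimal sketch that assumes an idealized 2 × 2 table with fractional expected counts and an uncorrected Pearson χ2 statistic. The source does not state whether a continuity correction or whole-patient counts were used, so the function name and these choices are illustrative only.

```python
CHI2_CRIT_1DF_05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def model_chi2(n: float, prevalence: float, sensitivity: float, specificity: float) -> float:
    """Pearson chi-square (no continuity correction) for an idealized 2 x 2 table."""
    tp = n * prevalence * sensitivity
    fn = n * prevalence * (1 - sensitivity)
    tn = n * (1 - prevalence) * specificity
    fp = n * (1 - prevalence) * (1 - specificity)
    numerator = n * (tp * tn - fn * fp) ** 2
    denominator = (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn)
    return numerator / denominator


# Model groupings: 20% malignancy, 65% sensitivity and specificity.
for n in (60, 70, 80):
    chi2 = model_chi2(n, prevalence=0.20, sensitivity=0.65, specificity=0.65)
    print(n, round(chi2, 2), chi2 > CHI2_CRIT_1DF_05)
```

Under these assumptions the statistic crosses the 0.05 threshold between 60 and 70 patients, and raising the malignancy rate to 40% at the same sample size gives a larger statistic, in line with the statement above.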
Subjects were classified as AMAS negative or positive, first by using Oncolab criteria and then by using experimental criteria based on values reported for Net-Tag, F-Tag, and S-Tag, as described below. Statistical evaluations were carried out using StatView (SAS Institute, Cary, NC). We employed ANOVA to compare mean values of Net-Tag, F-Tag, and S-Tag for groups with benign, suspicious, or malignant pathology for patients rated by Oncolab criteria as either AMAS normal, borderline, or elevated, and for all three Oncolab ratings combined. To evaluate associations, we used x-y plots and linear regression analyses with Pearson's r. Because some Oncolab readings were “borderline,” we evaluated data three ways: borderline excluded, borderline classified as positive, and borderline considered as negative. We calculated sensitivity and specificity two ways: suspicious pooled with benign and suspicious pooled with malignant. Data were entered on an Excel (Microsoft, Redmond, WA) spreadsheet, assigning categories of true positive (TP = pathology positive, AMAS positive), false negative (FN = pathology positive, AMAS negative), true negative (TN = pathology negative, AMAS negative), and false positive (FP = pathology negative, AMAS positive) for each patient for each set of trial criteria. For each set of criteria examined, we summed each category and calculated percent sensitivity as 100 × [TP / (TP + FN)] and specificity as 100 × [TN / (TN + FP)]. To identify criteria providing optimal discrimination, we plotted receiver-operator curves (sensitivity versus 1 − specificity; ref. 19) over ranges of Net-Tag, S-Tag, and F-Tag values, as described below. The optimal criterion for each curve was considered the point at which the difference between sensitivity and 1 − specificity was maximal, at which point the receiver-operator curve deviated from the line of identity to the greatest extent.
For each optimal criterion identified, we tested the 2 × 2 distribution of cases using Fisher's exact test (20). A nonrandom distribution was assumed when P < 0.05. We also estimated area under the receiver-operator (ROC) curves (AUC) using the method of successive parallelograms and calculated the area above the neutrality line as an estimate of deviation of each ROC curve from random.
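The tallying and cutoff search described in this paragraph can be sketched as follows. The original work used a spreadsheet, so this is an illustrative re-expression rather than the procedure actually run. The names `sens_spec` and `best_cutoff` and the demonstration values are hypothetical.

```python
from typing import List, Tuple

Case = Tuple[float, bool]  # (antibody value in ug/mL, pathology positive?)


def sens_spec(cases: List[Case], cutoff: float) -> Tuple[float, float]:
    """Percent sensitivity and specificity for the rule 'positive if value > cutoff'."""
    tp = sum(1 for value, diseased in cases if diseased and value > cutoff)
    fn = sum(1 for value, diseased in cases if diseased and value <= cutoff)
    tn = sum(1 for value, diseased in cases if not diseased and value <= cutoff)
    fp = sum(1 for value, diseased in cases if not diseased and value > cutoff)
    return 100 * tp / (tp + fn), 100 * tn / (tn + fp)


def best_cutoff(cases: List[Case], cutoffs: List[float]) -> Tuple[float, float, float]:
    """Cutoff maximizing sensitivity - (100 - specificity), with its sensitivity and specificity.

    Ties keep the lowest cutoff tried, which is a choice made for this sketch.
    """
    best = None
    for cutoff in cutoffs:
        sens, spec = sens_spec(cases, cutoff)
        distance = sens - (100 - spec)  # vertical distance from the line of identity
        if best is None or distance > best[0]:
            best = (distance, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical Net-Tag values, invented for illustration only (not study data).
demo = [(60, False), (80, False), (100, False), (110, False), (150, False),
        (90, True), (130, True), (140, True), (160, True), (200, True)]
print(best_cutoff(demo, list(range(50, 251, 10))))
```

Pooling suspicious cases with benign or with malignant only changes which cases carry the pathology-positive flag; the tallying itself is unchanged.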
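A minimal sketch of both calculations follows, with two stated assumptions. The two-sided Fisher P sums the probabilities of all tables no more likely than the observed one, which is a common convention that may differ from the StatView implementation. The area calculation uses the trapezoidal rule as a stand-in for the "method of successive parallelograms," which the text does not describe in detail, and it assumes both axes are in percent (the magnitudes of the areas reported in Results suggest this, but it is not stated).

```python
from math import comb
from typing import List, Tuple


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P for the 2 x 2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= observed * (1 + 1e-9))
    return min(p, 1.0)


def area_above_diagonal(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area between an ROC curve and the line of identity.

    points are (100 - specificity, sensitivity) pairs in percent, so the
    whole plot has area 10,000 and a perfect test scores 5,000.
    """
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * ((y0 - x0) + (y1 - x1)) / 2
    return area


# Hypothetical 2 x 2 table and ROC points, for illustration only (not study data).
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
print(area_above_diagonal([(0, 0), (20, 60), (50, 80), (100, 100)]))
```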
Results
Demography and Diagnoses
Complete biopsy and AMAS data were obtained for 71 women, ages 25 to 83 years (mean, 56.3 ± 1.4). As shown in Table 1, 42 (59%) had benign pathology, 12 (17%) were suspicious, and 17 (24%) were read as malignant. The most common benign diagnoses were fibrocystic disease (n = 24, of which seven were associated with apocrine metaplasia, one with a fibroadenoma, and two with ductal hyperplasia) and fibroadenoma (n = 12). Also classified as benign were atrophic changes (n = 2), ductal hyperplasia without atypia (n = 3), and apocrine metaplasia with no notation of associated fibrocystic disease (n = 1). Malignant lesions consisted of infiltrating ductal carcinoma (n = 13), ductal carcinoma in situ (n = 1), mixed ductal and lobular carcinoma (n = 1), and lobular carcinoma (n = 2). Diagnoses interpreted as suspicious included adenosis (n = 5), ductal hyperplasia with atypia (n = 4), lobular hyperplasia (n = 2), and papilloma (n = 1).
The pathologist investigators' diagnoses agreed with those of the local histopathologist with one exception, read as “florid ductal hyperplasia without atypia” by the study pathologists but as “ductal carcinoma in situ” by the local pathologist. Slides of this biopsy were submitted to the Armed Forces Institute of Pathology, which diagnosed “ductal intraepithelial neoplasia grade 1c,” so that this case was classified as suspicious. Women diagnosed as malignant tended to be older (mean ± SD, 61.8 ± 12.4 years) than those with benign (55.6 ± 12.3 years) or suspicious (51.1 ± 5.8 years) diagnoses, but only the age difference between the malignant and suspicious groups was statistically significant (P = 0.017).
Oncolab AMAS Test Results
Results from Oncolab consisted of values for F-Tag, S-Tag, and Net-Tag and qualitative readings using their criteria, stated above, for limits of F-Tag, S-Tag, and Net-Tag. Mean ages of patients with readings of elevated, borderline, and normal were respectively 52.7 ± 12.2, 57.0 ± 10.9, and 58.2 ± 11.9 years (not significant). As shown in Table 1, results for our subjects were distributed as 22 (31%) elevated, 11 (15%) borderline, and 38 (54%) normal. No trend for distribution of patients with malignant pathology into AMAS elevated and patients with benign pathology into AMAS normal categories was apparent. Consistent with the above, when means (Table 2) for each of the AMAS test values were compared across pathologic diagnoses, no statistical differences were found for subjects classified as AMAS normal, AMAS borderline, AMAS elevated, or all three categories combined.
Analysis of Pathology and AMAS Readings
As shown in Table 3, we calculated sensitivity and specificity after collapsing suspicious and malignant groups, testing three conditions: borderline excluded (reducing total to 60), borderline pooled with normal, and borderline pooled with elevated. Next, we repeated this same sequence after collapsing suspicious and benign groups. The assumptions optimizing sensitivity were borderline as positive (i.e., pooled with elevated) and suspicious pooled with malignant (sensitivity = 59%, specificity = 62%). Higher specificity (79%) was achieved by pooling borderline with normal as AMAS negative, while keeping suspicious and malignant together, but this reduced sensitivity (45%). Excluding borderline readings produced intermediate sensitivity and specificity values without improving discrimination. Pooling suspicious with benign consistently led to lower levels of both specificity and sensitivity. In no case did distributions reach statistical significance, but P values more closely approached significance (all P < 0.1) with suspicious and malignant patients pooled.
Case Distributions by Antibody Values
We next evaluated whether alternative criteria might provide better diagnostic discrimination than Oncolab's qualitative readings. We first constructed x-y plots of S-Tag versus Net-Tag (Fig. 1A) and versus F-Tag (Fig. 1B) for each classification. As seen in Fig. 1, S-Tag values have a strong positive association with Net-Tag (Fig. 1A; r = 0.94, P < 0.0001) and a weaker positive association with F-Tag (Fig. 1B; r = 0.47, P < 0.0001). Inspection (Fig. 1A and B) reveals substantial overlap of benign, suspicious, and malignant cases with no obvious separation of categories.
Diagnostic Discrimination by Antibody Values
To search for criteria providing optimal discrimination, we examined receiver-operator curves for Net-Tag, varied in increments of 10 from 50 to 250 μg/mL. With suspicious cases included with benign, the curve (Fig. 2) falls very close to the line of identity (i.e., no discriminative power). When suspicious and malignant are pooled, the curve deviates to the left with the greatest difference from the line of identity occurring at Net-Tag values of about 120 to 135 μg/mL. The criterion AMAS positive if Net-Tag > 125 provided the best discrimination (maximum distance from the line of identity) and produced case distributions shown in the first two lines of Table 4, which show that when suspicious cases were pooled with benign, discrimination (sensitivity = 41%, specificity = 70%, P = 0.39) was not as good as when suspicious cases were considered with malignant (sensitivity = 48%, specificity = 79%, P = 0.022). As shown in the next two lines of Table 4, keeping suspicious with malignant cases, we found that slightly lower (120 μg/mL) or higher (130 μg/mL) Net-Tag values reduced discrimination and resulted in a loss of statistical significance. For Net-Tag, the AUC above the line of neutrality with suspicious and malignant pooled was 876. For suspicious and benign pooled, the AUC was 222. We also constructed receiver-operator curves (data not shown) for F-Tag and S-Tag values considered singly. Corresponding AUC values for F-Tag were 1,077 and 1,180 and for S-Tag 1,152 and 715. As shown in Table 4, the best discrimination using F-Tag alone occurs at F-Tag > 140 μg/mL when suspicious cases are counted with benign, giving a specificity of 91% but a sensitivity of only 35% (P = 0.018), whereas when suspicious is considered with malignant, the best separation occurs at F-Tag > 110 μg/mL, providing specificity and sensitivity both equal to 62% (P = 0.057).
A search for the most discriminating S-Tag criteria revealed that S-Tag > 220 μg/mL gives a sensitivity of 59% and a specificity of 63% (P = 0.16) for suspicious included with benign, and the identical level of sensitivity but an increase in specificity to 69% (P = 0.028) when suspicious is considered with malignant (Table 4).
We next investigated whether Net-Tag and S-Tag values considered together might provide additional power for discrimination. Using the algorithm, “AMAS = positive IF Net-Tag > n1 OR S-Tag > n2,” we constructed receiver-operator curves (Fig. 3) for S-Tag incremented by 10 from 110 to 250 μg/mL at Net-Tag values (65, 100, 135, and 150 μg/mL), chosen arbitrarily to bracket widely the optimal value determined above of 125. We found that curves shifted left as Net-Tag was increased from 65 to 135 μg/mL but moved no further at a Net-Tag value of 150 μg/mL. AUCs above the line of identity were, respectively, 176, 737, 1,295, and 1,281. The greatest apparent distance from the line of identity occurred for the criterion “AMAS = positive IF Net-Tag > 135 OR S-Tag > 220,” as shown in Table 4 (sensitivity = 62%, specificity of 69%, P = 0.015). Exploration of the mathematical space immediately adjacent to these criteria by varying Net-Tag and F-Tag limits up and down in increments of five failed to improve discrimination (data not shown).
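The two-value rule and the way its curves were traced (Net-Tag limit held fixed, S-Tag limit swept) can be sketched as below. The function names and patient tuples are hypothetical and only illustrate the mechanics of the algorithm quoted above.

```python
from typing import Iterable, List, Tuple

Patient = Tuple[float, float, bool]  # (Net-Tag, S-Tag, pathology positive?)


def amas_positive(net_tag: float, s_tag: float, n1: float, n2: float) -> bool:
    """AMAS = positive IF Net-Tag > n1 OR S-Tag > n2."""
    return net_tag > n1 or s_tag > n2


def roc_points(patients: List[Patient], n1: float,
               s_cutoffs: Iterable[float]) -> List[Tuple[float, float, float]]:
    """One (S-Tag cutoff, 100 - specificity, sensitivity) row per cutoff, Net-Tag limit fixed."""
    path_pos = [p for p in patients if p[2]]
    path_neg = [p for p in patients if not p[2]]
    rows = []
    for n2 in s_cutoffs:
        tp = sum(amas_positive(net, s, n1, n2) for net, s, _ in path_pos)
        fp = sum(amas_positive(net, s, n1, n2) for net, s, _ in path_neg)
        rows.append((n2, 100 * fp / len(path_neg), 100 * tp / len(path_pos)))
    return rows


# Hypothetical patients in ug/mL, for illustration only (not study data).
demo = [(140, 200, True), (90, 230, True), (80, 150, True),
        (100, 240, False), (60, 120, False)]
for row in roc_points(demo, n1=135, s_cutoffs=range(110, 251, 10)):
    print(row)
```

Because the rule is an OR, raising the Net-Tag limit can only remove positives at a given S-Tag limit, which is one way to see why each curve in the family moves as the Net-Tag limit changes.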
Finally, because F-Tag showed less correlation with S-Tag than did Net-Tag (see Fig. 1B), we constructed a series of receiver operator curves (data not shown) to investigate whether some combination of F-Tag and S-Tag criteria might give improved discrimination. Using the algorithm, “AMAS = positive IF S-Tag > n1 AND F-Tag > n2,” for F-Tag levels incremented by 10 from 80 to 150 μg/mL at each S-Tag value starting at 150 and incremented by 5 up to 180 μg/mL, we found the best discrimination was for S-Tag > 160 AND F-Tag > 110 (Table 3) giving a sensitivity of 59% at a specificity of 67% (P = 0.051). Discrimination was not improved by small variations around these values.
Discussion
The AMAS test has been reported to have sensitivities ranging from 95% to 100% for a variety of malignancies, with specificities ranging from 60% to 95% (10-13). These levels seem high, given that serum tests for cancer such as carcinoembryonic antigen (21, 22) or prostate-specific antigen (23, 24) do not achieve comparable discrimination. In the best-controlled previous AMAS study (13), in women with suspicious breast findings coming for a needle biopsy, sensitivity was reported as 97% and specificity as 64%. However, there were relatively few biopsy patients without cancer (n = 11, 25%), so that a group of healthy (and potentially noncomparable) women were used as additional controls. Our study employed a similar design but with careful blinding and rigorous histopathologic review. The distribution observed (24% malignant and 17% suspicious) resembles that seen in larger series of similar patients (18).
Sensitivity by Oncolab's criteria seemed better for suspicious lesions (75%) than for cancer (47%) with a value of 59% overall. The specificity of 62% was similar to prior findings of 54% in biopsy patients and 64% with additional normal controls (13). Empirical criteria based on a receiver-operator curve with an algorithm combining Net-Tag (>135 μg/mL) and S-Tag (>220 μg/mL) values improved discrimination somewhat (sensitivity = 62%, specificity = 69%, P = 0.015). These cutoff values are similar, but not identical, to the criteria specified by Oncolab.
Because needle core biopsy has a reported specificity of 99% (18), it is unlikely that any patients classified as malignant in our study did not have cancer. Thus, the sensitivity of only 62% is probably not due to misdiagnosis of malignancy in patients testing AMAS negative.
Positive AMAS tests in the majority of patients with “suspicious” lesions may be due to expression of malignin in neoplasias not histologically identifiable as cancer or to the presence of cancers undetected by core biopsy. Our suspicious category corresponded to those categorized by Lee et al. (25) as suspicious (B4) or of uncertain malignant potential (B3) after a core-needle biopsy, of which, respectively, 86% and 25% were upgraded to malignancy upon excision. In another study, 48% of patients with biopsy cores classified as atypical ductal hyperplasia and intraductal atypia of uncertain significance had diagnoses of carcinoma in subsequent excisions (26). Thus, it is likely that most patients in our suspicious category harbored malignancies and were appropriately included with the malignant classification for analysis.
Evaluation of specificity presents the problem of uncertainty as to whether an AMAS “false-positive” patient has an undetected cancer. Previous studies have found cancers in excision biopsies of 4% (17), 8% (27), and 12% (18) of patients with benign core-needle diagnoses. Using the “worst-case scenario,” if 12% of our 42 biopsy-benign patients were misclassified and all of these patients were AMAS positive, “correcting” the diagnoses would only increase sensitivity to 68% and specificity to 78%. In addition, some patients may be AMAS positive because they harbor occult malignancies other than breast cancer. In a series of 1,026 patients and controls, of those initially AMAS “false positive,” 23% were diagnosed with cancer within 19 months (12). The contribution of occult malignancy to confounding remains uncertain, because the extent to which a known positive AMAS test contributed to detection bias is unclear, and the number of AMAS-negative patients who developed cancer was not reported. According to the National Cancer Institute statistics (28), cumulative incidence of new cancer diagnoses in women 55 to 60 years of age is ∼1.6% in 5 years. Therefore, it is very unlikely that >1 of our 42 patients with benign pathology harbored occult non-breast neoplasia.
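The worst-case arithmetic above can be checked with a few lines. The cell counts are an assumption: they are reconstructed from the reported 62% sensitivity (18 of 29) and 69% specificity (29 of 42) for the criterion "Net-Tag > 135 OR S-Tag > 220," not taken from a table in the text.

```python
# Assumed counts, reconstructed from reported percentages (suspicious pooled with malignant).
tp, fn = 18, 11   # 29 pathology-positive patients
tn, fp = 29, 13   # 42 biopsy-benign patients

# Worst case: 12% of the 42 biopsy-benign patients actually had cancer,
# and every one of them was among the AMAS positives.
missed = round(0.12 * 42)

tp_corrected = tp + missed   # former "false positives" become true positives
fp_corrected = fp - missed

sensitivity = 100 * tp_corrected / (tp_corrected + fn)
specificity = 100 * tn / (tn + fp_corrected)
print(missed, round(sensitivity), round(specificity))
```

With these assumed counts the corrected values round to the 68% and 78% quoted above.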
There are a number of possible reasons why we found the AMAS test to be less discriminating than did previous reports (10-13). First, as previously noted by Smith (14), prior studies lacked rigor, in that specimens came from multiple sources; pathology reports were not independently reviewed; and/or blinding procedures were inconsistently employed. In addition, patients were diverse; some had early or occult malignancies, whereas others had known cancers, with and without metastases or prior treatment. Second, recent changes in procedures, reagents, or controls at Oncolab may have produced less accurate AMAS testing. Third, our study, being restricted to a particular kind of cancer and stage of diagnostic uncertainty, may not be representative of the more varied patient population reported previously, an explanation not applicable to the Thornethwaite study (13). Finally, because our series was relatively small, our results could represent the random inclusion in the study of an unusual number of patients with malignant disease but low AMAS titers. Only further studies with larger numbers of patients could resolve this question.
To conclude, we found that discrimination of neoplasia by AMAS testing was significantly different from random, with sensitivities and specificities similar to those of other clinical tests such as prostate-specific antigen (29) and carcinoembryonic antigen (30-32). However, in middle-aged women with a breast finding suspicious for cancer, the test was insufficiently sensitive to allow a confident decision for or against biopsy, and false-positive rates were too high for screening of low-risk populations. Whether an improved method for detection of anti-malignin antibody might prove useful as a component of a test battery for cancer detection is a subject for future research.
Addendum
We asked each of the 71 subjects by follow-up letter whether they had surgery following biopsy and whether at this, or a subsequent surgery or biopsy, they had received a diagnosis of malignancy. A total of 40 subjects (56%) responded, all between 24 and 36 months after the date of biopsy. Of these, 12 were biopsy positive, all of whom had surgery and all of whom reported definitive diagnoses of malignancy. Four of the responders had suspicious biopsy diagnoses and, of these, only one had surgery. She also had a malignancy. The other three subjects with suspicious diagnoses reported no further surgery or biopsies. Finally, of the 24 responding subjects with benign biopsy diagnoses, three reported subsequent diagnoses of malignancy. Distribution of AMAS classification, using our “best” criterion (AMAS = positive IF Net-Tag > 135 OR S-Tag > 220), is shown in Table 5. Thus, on follow-up, pooling suspicious with malignant gives a total of 19 subjects with neoplasia, of whom 11 were AMAS positive (sensitivity = 58%). Using the same data, specificity was 62%. These values are similar to those derived from the larger data set based on initial biopsy results.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.