Abstract
Purpose: An unmet need in low-resource countries is an automated breast cancer detection assay to prioritize women who should undergo core breast biopsy and pathologic review. Therefore, we sought to identify and validate a panel of methylated DNA markers to discriminate between cancer and benign breast lesions using cells obtained by fine-needle aspiration (FNA).
Experimental Design: Two case–control studies were conducted comparing cancer and benign breast tissue identified from clinical repositories in the United States, China, and South Africa for marker selection/training (N = 226) and testing (N = 246). Twenty-five methylated markers were assayed by Quantitative Multiplex-Methylation-Specific PCR (QM-MSP) to select and test a cancer-specific panel. Next, a pilot study was conducted on archival FNAs (49 benign, 24 invasive) from women with mammographically suspicious lesions using a newly developed, 5-hour, quantitative, automated cartridge system. We calculated sensitivity, specificity, and area under the receiver-operating characteristic curve (AUC) compared with histopathology for the marker panel.
Results: In the discovery cohort, 10 of 25 markers were selected that were highly methylated in breast cancer compared with benign tissues by QM-MSP. In the independent test cohort, this panel yielded an AUC of 0.937 (95% CI = 0.900–0.970). In the FNA pilot, we achieved an AUC of 0.960 (95% CI = 0.883–1.00) using the automated cartridge system.
Conclusions: We developed and piloted a fast and accurate methylation marker–based automated cartridge system to detect breast cancer in FNA samples. This quick ancillary test has the potential to prioritize cancer over benign tissues for expedited pathologic evaluation in poorly resourced countries.
Breast cancer screening in developed countries has saved millions of lives. In developing countries, delayed diagnosis of breast cancer is a major contributor to high breast cancer mortality rates. Timely diagnosis is dependent on rapid pathologic review of patient samples, a service that is often not available. We propose that a quick ancillary test to prioritize malignant versus benign cases could lead to earlier detection and better survival from cancer. DNA methylation is a molecular hallmark of breast cancer. We developed a panel of cancer-specific DNA methylation markers for use in our automated breast cancer detection cartridge. In a pilot study of fine-needle aspirates of lesions detected during breast imaging, the cartridge system was 96% sensitive and 90% specific compared with cytopathology. This automated 5-hour assay has the potential to expedite the diagnosis of breast cancer in low-resource settings and thereby help reduce the global mortality rates.
Introduction
In developing countries of the world, breast cancer is the most frequently diagnosed cancer in women and the leading cause of cancer-related death (1, 2). Breast cancer incidence is rising globally because of longer life expectancies, decreased burden of infectious diseases, and changes in reproductive risk factors (3–5). Decreases in overall breast cancer mortality in the United States and Canada are associated with advances in screening and in adjuvant therapy (6, 7). Low-resource countries have poor breast cancer survival rates partly because of lack of organized screening programs, late stage at diagnosis, and limited access to timely, standard treatment (1, 5). To reduce the global burden of breast cancer, we require novel approaches to breast cancer screening, early detection programs, and access to treatment in low- and middle-income countries of the world (8, 9).
In underdeveloped countries, there is a critical shortage of clinical pathologists and resources to perform the complex histopathologic evaluation of breast biopsies (10, 11). Currently, even after the breast lesion is biopsied, there may be delays of up to 10 months for a definitive diagnosis in many South American and African countries (10, 11). Therefore, developing a molecular test for quick and accurate determination of a suspicious breast lesion will enable more efficient use of limited resources and earlier detection of breast cancer.
Hypermethylation of CpG islands located in the promoter regions of tumor suppressor genes is recognized as one of the earliest, most frequent, and robust alterations in cancer development (12, 13). However, low-level methylation occurs in certain CpG regions in benign conditions (14–19). Euhus and colleagues performed a genome-wide search for differentially methylated markers in cancer and benign breast epithelial cell lines treated with 5-azacytidine. A select number of markers were evaluated in fine-needle aspirations (FNAs) of 97 cancer and 327 benign disease cases. Significant differential methylation was reported for CPNE8, PSAT1, CXCL14, CLDN1, and GNE (14). However, no further validation of these markers was performed. Other researchers have reported modest differences for methylation markers in serum that failed to provide sufficient ability to distinguish between cancer and benign/normal controls (20, 21). Thus, better markers and further studies are warranted.
In this article, we report the selection of a panel of DNA methylation markers that distinguishes between benign and cancer breast tissues with high sensitivity and specificity. We incorporated the panel into an automated breast cancer detection cartridge system that quantitated gene methylation with a high level of accuracy within 5 hours. We tested this assay in a blinded pilot study of archival breast FNA samples from cancer and benign lesions. Our results suggest that the cartridge assay has potential to detect breast cancer in FNA samples of suspicious breast lesions identified by mammography or ultrasound-based screening, and thereby could have utility to prioritize breast lesions that require biopsy, especially in low-resource countries.
Materials and Methods
Study design
Fifty cases (invasive ductal carcinoma and ductal carcinoma in situ) and 50 controls (benign breast disease) were sought from each of the three participating sites, United States, China, and South Africa, a design that would result in a total of 300 samples, or 75 cases and 75 controls in each of the training and test phases of development. With 75 samples per group, the precision of a one-sided, 95% confidence bound is less than 0.1, and precision is even better for sensitivities and specificities near 1.0. However, collection exceeded expectation and we were able to increase the sample size by more than 50%. A total of 449 formalin-fixed paraffin-embedded (FFPE) breast tissues from the three regions were randomly assigned to training (N = 226) and test (N = 223) case–control sets (Fig. 1) using the Excel RANDBETWEEN function. The randomization was carried out separately for each geographical region and type of lesion to ensure balance (Table 1). Later, 23 invasive lobular carcinomas (ILCs) were added to the test set to ensure inclusion of this significant pathologic subset of breast carcinoma. Our study design (Fig. 1) was as follows: First, 25 hypermethylated candidate gene markers were screened by Quantitative Multiplex Methylation-Specific PCR (QM-MSP) in the training cohort to identify a minimal marker panel of 10 genes using the predetermined criteria described below. This marker panel was evaluated to establish the methylation threshold best able to distinguish between benign and cancer samples. Because of limited available DNA, it was not possible to evaluate all candidate markers in all samples of the training cohort, so the final set of markers was measured in 206 of the 226 samples. Similarly, the marker panel was examined by QM-MSP in a test cohort of 220 samples, consisting of 197 samples with sufficient DNA plus an additional new sample set of 23 ILCs.
We evaluated the utility of the assay methylation threshold (defined in the training cohort) for the novel gene panel (Fig. 1).
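The sample-size statement above (a one-sided 95% bound with precision below 0.1 at 75 samples per group) can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) bound, which the text does not specify; the function name is illustrative only.

```python
import math

def one_sided_bound_halfwidth(p: float, n: int, z: float = 1.645) -> float:
    """Distance between a proportion p and its one-sided 95% bound.

    Assumption: normal (Wald) approximation; z = 1.645 is the one-sided
    95% normal quantile. The source does not state which method was used.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

n_per_group = 75
for p in (0.5, 0.8, 0.9, 0.95):
    width = one_sided_bound_halfwidth(p, n_per_group)
    print(f"sensitivity or specificity = {p:.2f}: precision = {width:.3f}")
```

Under this approximation the worst case (p = 0.5) stays just under 0.1, and the bound tightens as the proportion approaches 1.0, in line with the statement in the text.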
Sample sets | Training (FFPE) | | | | Test (FFPE) | | | | Pilot (FNA) |
---|---|---|---|---|---|---|---|---|---|
Region | United States | China | South Africa | Total | United States | China | South Africa | Total | Total |
Sample N; total | 87 | 89 | 50 | 226 | 109 | 88 | 49 | 246 | 76 |
Sample N; with 10 marker CMI | 68 | 88 | 50 | 206 | 85 | 87 | 48 | 220 | 73 |
IDC, N | 29 | 29 | 13 | 71 | 29 | 30 | 13 | 72 | 24 |
Receptor status | |||||||||
ER/PR+, HER2− | 11 | 19 | 8 | 38 | 3 | 18 | 5 | 26 | 16 |
ER/PR+, HER2+ | 2 | 2 | 3 | 7 | 1 | 3 | 3 | 7 | 2 |
ER/PR−, HER2+ | 2 | 4 | 0 | 6 | 1 | 1 | 1 | 3 | 1 |
ER/PR−, HER2− | 4 | 4 | 2 | 10 | 11 | 7 | 3 | 21 | 3 |
Unknown | 10 | 0 | 0 | 10 | 13 | 1 | 1 | 15 | 2 |
Age (in years) | |||||||||
Median | 56 | 53 | 50 | 54 | 54 | 50 | 55 | 54 | 59 |
Range | 25–87 | 23–66 | 32–74 | 23–87 | 25–70 | 35–75 | 34–88 | 25–88 | 30–87 |
Invasive lobular carcinoma, N | — | — | — | — | 23 | — | — | 23 | 0 |
Receptor status | |||||||||
ER/PR+ | — | — | — | — | 19 | — | — | 19 | — |
ER/PR− | — | — | — | — | 4 | — | — | 4 | — |
Unknown | — | — | — | — | — | — | — | — | — |
Age (in years) | |||||||||
Median | — | — | — | — | 58 | — | — | 58 | — |
Range | — | — | — | — | 38–81 | — | — | 38–81 | — |
Ductal carcinoma in situ, N | 18 | 8 | 13 | 39 | 18 | 7 | 13 | 38 | 0 |
Age (in years) | |||||||||
Median | 58 | 54 | 59 | 56 | 56 | 53 | 46 | 53 | — |
Range | 42–78 | 25–71 | 37–76 | 25–78 | 42–79 | 45–72 | 26–82 | 26–82 | — |
Benign breast disease, N | 40 | 52 | 12 | 104 | 39 | 51 | 12 | 102 | 52 |
Age (in years) | |||||||||
Median | 51 | 46 | 39 | 47 | 52 | 48 | 37 | 47 | 41 |
Range | 26–78 | 26–68 | 17–62 | 17–78 | 36–75 | 23–73 | 19–60 | 19–75 | 19–74 |
Normal breast, N | — | — | 12 | 12 | — | — | 11 | 11 | 0 |
Age (in years) | |||||||||
Median | — | — | 45 | 45 | — | — | 45 | 45 | — |
Range | — | — | 37–96 | 37–96 | — | — | 23–69 | 23–69 | — |
Finally, these findings were translated to a newly developed, automated breast cancer detection DNA methylation cartridge. We performed a pilot study of archival FNAs (N = 76) collected from Portugal and Hong Kong (Fig. 1).
The study, performed on archival, formalin-fixed cancer, benign, and normal breast samples, was approved by the Institutional Review Boards of Johns Hopkins (Baltimore, MD); Emory University School of Medicine (Atlanta, GA); Renmin Hospital, China (Wuhan, China); University of the Witwatersrand (Johannesburg, South Africa); Instituto de Patologia e Imunologia Molecular da Universidade do Porto (Portugal); and The Chinese University of Hong Kong (Hong Kong Special Administrative Region). The study was conducted in accordance with the Declaration of Helsinki and the U.S. Common Rule. Patients provided informed consent to use excess tissue and cells for research. Consent requirement was waived at The Chinese University of Hong Kong, and excess slides of FNA biopsy were used.
Patient materials
DNA was extracted from FFPE tissues (N = 472; 449 randomized to training and test + 23 ILC in test) obtained from Johns Hopkins Surgical Pathology, Renmin Hospital of Wuhan University, China, and National Health Laboratory Service, South Africa, following review by a breast pathologist to confirm correct classification. Breast cancer samples included invasive ductal carcinoma (IDC), invasive lobular carcinoma (ILC), and ductal carcinoma in situ (DCIS). Two noncancer groups were studied: normal breast and benign breast disease. The cancer FFPE blocks ranged in age from 2 to 28 years (median, 4 years), and the benign/normal blocks ranged in age from 2 to 19 years (median, 2 years). A large majority of our FFPE samples (cancer: 71%; benign: 89%) were 2–10 years old. Archival diagnostic stained slides of FNA smears from 24 cases of IDC and 52 benign lesions (26 fibroadenoma, 19 benign ductal epithelium, five fibrocystic, two cyst) were obtained from the Medical Faculty of the University of Porto (Portugal, N = 33) and The Chinese University of Hong Kong (China, N = 43). Of these, three benign lesions were discarded for lack of sufficient DNA. In the FNA pilot, the median age of the FNA slides from IDC was 10 years (range, 1–11 years) and for benign disease was 3 years (range, 1–11 years). Patient characteristics are provided in Table 1. The IHC subtype information for estrogen/progesterone receptor (ER/PR) and HER2 was available for a subgroup of the invasive cancer samples in the training and test sets (N = 108).
Assay development, marker selection, and training
Quantitative Multiplex Methylation-Specific PCR (22, 23) was used in the training and test sets for marker selection and evaluation. Twenty-five individual markers were assayed using DNA from FFPE tissues (N = 226) in the training cohort. Marker selection criteria required first, that markers show considerably higher median methylation in cancer than in benign/normal tissue (P < 0.001, based on the Mann–Whitney test). Second, to further minimize the risk of false positives, benign and normal samples were required to be methylated at lower levels and less frequently compared with cancer samples. Specifically, we calculated the 75th percentile of methylation separately in tumor and benign samples, discarding any markers in which the 75th percentile of normal methylation was high, or where the difference between normal and tumor was small (Supplementary Table S1; Supplementary Fig. S1). The selected markers were then evaluated as a panel in the training set of N = 206 samples with sufficient DNA. QM-MSP values for the panel were expressed as cumulative methylation index (CMI), the sum of percent methylation for each gene in the panel (22, 23). Using receiver-operating characteristic (ROC) curve analysis, we identified the laboratory CMI threshold for the study that maximized specificity while maintaining sensitivity at 90%.
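The CMI definition and the threshold rule described above (maximize specificity while keeping sensitivity at or above 90%) can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names, the rule that a sample is positive when its CMI is at or above the threshold, and all numeric values are assumptions made for the example.

```python
from typing import Dict, List, Tuple

def cumulative_methylation_index(percent_methylation: Dict[str, float]) -> float:
    """CMI: the sum of percent methylation over all genes in the panel."""
    return sum(percent_methylation.values())

def pick_threshold(cancer: List[float], benign: List[float],
                   min_sensitivity: float = 0.90) -> Tuple[float, float, float]:
    """Scan observed CMI values as candidate cutoffs and return
    (threshold, sensitivity, specificity) with the highest specificity
    among cutoffs whose sensitivity is at least min_sensitivity.

    Assumption: a sample is called positive when CMI >= threshold.
    """
    best = None
    for t in sorted(set(cancer) | set(benign)):
        sens = sum(c >= t for c in cancer) / len(cancer)
        spec = sum(b < t for b in benign) / len(benign)
        if sens >= min_sensitivity and (best is None or spec > best[2]):
            best = (t, sens, spec)
    if best is None:
        raise ValueError("no threshold reaches the requested sensitivity")
    return best

# Hypothetical CMI values, invented purely for illustration.
toy_cancer = [5, 20, 35, 60, 80, 120, 150, 200, 260, 300]
toy_benign = [0, 1, 2, 3, 4, 6, 8, 10, 12, 25]
print(cumulative_methylation_index({"APC": 10.0, "RASSF1": 4.5}))
print(pick_threshold(toy_cancer, toy_benign))
```

Scanning only observed values is sufficient because sensitivity and specificity change only at those points; a production analysis would typically use an established ROC routine instead.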
Marker testing.
The marker panel was assayed in the test FFPE samples by QM-MSP using locked parameters defined in the training set. The independent test set (N = 220) consisted of 66 IDC, 23 ILC, 30 DCIS, 91 benign, and 10 normal breast tissues. For the marker panel, sensitivity, specificity, and AUC were evaluated.
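For readers unfamiliar with how an AUC relates to the underlying CMI values, the sketch below computes it as the probability that a randomly chosen cancer sample has a higher CMI than a randomly chosen benign sample, with ties counted as one half (the Mann–Whitney formulation). The function name and inputs are hypothetical, and the source does not state how its AUCs or their confidence intervals were computed.

```python
from typing import Sequence

def auc_mann_whitney(cancer: Sequence[float], benign: Sequence[float]) -> float:
    """AUC as the fraction of (cancer, benign) pairs in which the cancer
    sample scores higher; tied pairs contribute 0.5."""
    wins = 0.0
    for c in cancer:
        for b in benign:
            if c > b:
                wins += 1.0
            elif c == b:
                wins += 0.5
    return wins / (len(cancer) * len(benign))

# Hypothetical CMI values: perfect separation gives an AUC of 1.0.
print(auc_mann_whitney([30.0, 45.0], [2.0, 10.0]))
```

The pairwise loop is quadratic in sample count, which is acceptable for cohorts of a few hundred samples such as those described here.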
Differences in DNA methylation based on IHC subtype and geographic origin.
We evaluated differences in DNA methylation among four IHC subtypes of breast cancer, using clinical data for expression of estrogen receptor/progesterone receptors (ER/PR) and HER2: ER/PR+, HER2−; ER/PR+, HER2+; ER/PR−, HER2+; ER/PR−, HER2− (triple-negative breast cancer, TNBC) in samples from training and test sets (total N = 108). CMI of the 10-gene panel for the four IHC subtypes was compared using the Kruskal–Wallis test. Relative differences in percent methylation of each gene were evaluated in two-way comparisons by the Mann–Whitney test. The Fisher exact test was performed to evaluate the difference in frequency of samples that test positive (higher than the ROC CMI threshold) among the IHC subtypes. For the marker panel, sensitivity, specificity, and AUC were evaluated for tumors from each subtype.
We evaluated tumors from three geographical regions, United States, China, and South Africa, for differences in DNA methylation of the 10-gene panel. Cumulative methylation and individual gene methylation were assessed for the cancer and benign samples by the Kruskal–Wallis test. For the marker panel, sensitivity, specificity, and AUC were evaluated for tumors from each region.
Assay development on the GeneXpert platform
Cartridge design.
The GeneXpert System (Cepheid) consists of a computer preloaded with software for running GeneXpert cartridges tailored for PCR-based diagnostics. The automated methylation-specific PCR system described in this study is a prototype in development, not for use in diagnostic procedures, and has not been reviewed by any regulatory body. The assay has three single-use cartridges, one of which holds reagents required for bisulfite treatment of DNA (Cartridge A) and the two remaining cartridges hold reagents for quantitative PCR of Marker Set 1 or Marker Set 2 (Cartridge B). Each marker set consists of five target genes and ACTB as the internal reference, and utilizes six fluorophores.
Cartridge performance.
To determine the sensitivity of detection of methylated DNA in the cartridge, CAMA-1 breast cancer cells (200, 100, and 50 cells; ATCC, authenticated by the Johns Hopkins Genetic Resources Core Facility), which are fully methylated by QM-MSP for each of the genes in our marker panel, were tested in six replicates. To generate standard curves for quantitation of methylation in the cartridge and for testing interassay reproducibility, CAMA-1 cells were mixed with fully unmethylated human sperm DNA (HSD) prior to lysis.
Sample preparation.
Archival stained FNA slides were soaked in xylene to remove the coverslip, and scraped into proteinase K/FFPE lysis buffer solution (20 μL of proteinase K and 1.2-mL lysis buffer; FFPE Lysis Kit, 900-0697, Cepheid), digested at 80°C for 30 minutes, mixed with 1.2 mL of ethanol, and loaded into the bisulfite conversion Cartridge A. Methylated CAMA-1 cells, or CAMA-1 cells mixed with unmethylated HSD, were similarly digested and loaded into Cartridge A. Upon completion of the reaction, bisulfite-treated DNA was transferred to the Set 1-Cartridge B or Set 2-Cartridge B to quantitate methylation.
Quantitation of DNA methylation.
Ct values were obtained using the GeneXpert software for methylated targets and ACTB reference (Ct = the cycle threshold at which signal fluorescence exceeds background). For calculating percent methylation, the ΔCt (Ct Gene – Ct ACTB) value of each target gene was interpolated from historic standard curves of mixtures of methylated and unmethylated DNA ranging from 100% to 3% methylation. This enabled quantitation of cumulative methylation (CM), which is the sum of percent methylation for all genes in the marker panel.
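The conversion from ΔCt to percent methylation can be sketched as below. This is an illustration under stated assumptions, not the GeneXpert software's method: it assumes a straight-line standard curve of ΔCt against log2(percent methylated DNA), a two-fold dilution series, idealized ΔCt readings, and clamping of results to the 0–100% range. All function names and values are hypothetical.

```python
import math
from typing import List, Tuple

def fit_standard_curve(percent: List[float], delta_ct: List[float]) -> Tuple[float, float]:
    """Least-squares line: delta_ct = slope * log2(percent) + intercept."""
    xs = [math.log2(p) for p in percent]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(delta_ct) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, delta_ct))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def percent_methylation(delta_ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve; clamp to 0-100% (an assumption)."""
    pct = 2.0 ** ((delta_ct - intercept) / slope)
    return min(max(pct, 0.0), 100.0)

# Hypothetical standards: each two-fold dilution adds one cycle to delta Ct.
std_percent = [100.0, 50.0, 25.0, 12.5, 6.25, 3.125]
std_delta_ct = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
slope, intercept = fit_standard_curve(std_percent, std_delta_ct)

# Cumulative methylation is the sum of per-gene percent methylation.
sample_delta_ct = {"GENE_A": 2.0, "GENE_B": 4.0}  # placeholder gene names
cm = sum(percent_methylation(d, slope, intercept) for d in sample_delta_ct.values())
print(round(cm, 2))
```

In practice each marker would have its own standard curve, and readings outside the range of the standards would need a documented handling rule.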
Interassay reproducibility was analyzed in the cartridge by performing six replicate assays using mixtures of fully methylated CAMA-1 cells and fully unmethylated HSD to achieve methylation values ranging from 100% to 3%. Reproducibility was reported as coefficient of variation (CV) as calculated by GraphPad Prism version 7.04.
Interoperator reproducibility was tested by two individuals who performed the assay using separate aliquots of the same lysate from 23 archival FNA samples. The R package psych (https://CRAN.R-project.org/package=psych) was used to evaluate interoperator reproducibility in cumulative methylation values using intraclass correlation coefficient (ICC). Cohen kappa coefficient (κ), calculated using the R package fmsb and function Kappa.test, was used to measure interrater agreement for presence or absence of cancer cells in the FNA based on quantitative methylation values.
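To make the agreement statistic concrete, the sketch below computes Cohen's kappa from two operators' positive/negative calls after applying a cumulative methylation cutoff. It is a plain re-implementation for illustration and does not reproduce the fmsb `Kappa.test` output (for example, it gives no confidence interval); the operator readings are invented and the cutoff of 14.5 is used only as an example value.

```python
from typing import List

def cohen_kappa(calls_a: List[bool], calls_b: List[bool]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected).

    Undefined (division by zero) when expected agreement equals 1,
    for example when both raters give every sample the same call.
    """
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    labels = set(calls_a) | set(calls_b)
    expected = sum((calls_a.count(lab) / n) * (calls_b.count(lab) / n)
                   for lab in labels)
    return (observed - expected) / (1.0 - expected)

# Hypothetical cumulative methylation readings from two operators.
threshold = 14.5
operator_1 = [2.0, 30.0, 50.0, 10.0, 80.0, 5.0]
operator_2 = [3.0, 28.0, 55.0, 16.0, 75.0, 4.0]
calls_1 = [v >= threshold for v in operator_1]
calls_2 = [v >= threshold for v in operator_2]
print(round(cohen_kappa(calls_1, calls_2), 3))
```

Kappa summarizes agreement on the dichotomized call, while the ICC described above addresses agreement on the continuous methylation values.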
Clinical pilot testing of FNA samples.
A pilot study for breast cancer detection was performed on 76 FNAs for the selected 10-gene panel of methylated markers in the automated GeneXpert cartridge. Percent methylation for each marker analyzed in the cartridge was calculated by interpolation, using standard curves as described above, and evaluated by nonlinear fit regression in GraphPad Prism and expressed as cumulative methylation.
Results
Selection of methylated gene markers
We evaluated 25 candidate gene markers in the formalin-fixed paraffin-embedded (FFPE) archival training set samples by QM-MSP to identify those with the highest sensitivity and specificity for cancer based on histopathology of the core biopsy (Fig. 1). Candidate gene markers were among those previously identified by us as having frequent measurable gene hypermethylation in ER/PR+, HER2−; ER/PR+, HER2+; ER/PR−, HER2+, and ER/PR−, HER2− breast cancer (23, 24). Nine of the 25 markers were discarded on the basis of their inability to significantly distinguish between cancers (IDC/DCIS) versus benign/normal breast tissues in the training set. Four additional candidate markers were discarded either because normal/benign samples had high methylation, or because the absolute difference in methylation levels was small. Among the 12 markers thus selected, TWIST1 and MAL were discarded (Supplementary Fig. S1). The final panel consisted of 10 markers: AKR1B1, APC, CCND2, COL6A2, HIST1H3C, HOXB4, RASGRF2, RASSF1, TMEFF2, and ZNF671. Each of the gene markers showed significantly higher methylation in cancer than benign/normal samples (Fig. 2).
Evaluation of performance of the 10-gene marker panel
Using training set samples, cumulative methylation for the 10-gene panel was higher in cancer compared with benign/normal samples as shown in the histogram (Fig. 3A) and in the box plot (Fig. 3A, inset). ROC analyses of the training set established the laboratory threshold that provided the optimal specificity while retaining ≥90% sensitivity. A sensitivity of 90% (95% CI = 82%–95%), a specificity of 88% (95% CI = 80%–93%), and an AUC of 0.948 (95% CI = 0.914–0.976) were achieved in the training data at a threshold of 14.5 CMI units (Fig. 3A, inset).
Verification of the performance of the 10-gene marker panel
Parameters defined in the training set were locked for analysis in the independent test set of 220 cancer, benign, and normal breast tissues using QM-MSP. We calculated the cumulative methylation in the test set for all 10 genes in the panel, shown as a histogram (Fig. 3B). The marker panel in the test set was significantly more methylated in IDC/ILC/DCIS compared with benign/normal tissues as shown in the box plot (P < 0.0001; Fig. 3B, inset; Mann–Whitney). Using ROC analyses, the sensitivity achieved was 87% (95% CI = 80%–93%) and specificity was 88% (95% CI = 80%–94%) at a CMI threshold of 14.5 units (AUC = 0.937, 95% CI = 0.900–0.970; P < 0.0001; Fig. 3B, inset). We also analyzed the performance of the assay for detecting IDC, ILC, and DCIS individually (Supplementary Fig. S2). For DCIS alone, the sensitivity was 77% (95% CI = 58%–90%), the specificity was 88% (95% CI = 80%–94%), and the AUC was 0.881 (95% CI = 0.777–0.962; Supplementary Fig. S2). For detection of IDC and ILC alone, sensitivity was higher at 91% (95% CI = 81%–97%) and 91% (95% CI = 72%–99%), respectively, and specificity was 88% (95% CI = 80%–94%); the corresponding AUC values are shown in Supplementary Fig. S2.
Influence of age on DNA methylation in test and training sets
Age of blocks.
Formalin and other fixation methods used in preparation of FFPE tissues can result in degradation of DNA, which may increase with the age of the blocks (25, 26). Because our FFPE tissues ranged in age from 2 to 28 years, we tested the influence of block age on CMI measured by QM-MSP. A linear regression model of CMI as a function of FFPE block age, fit on benign/normal samples, showed a modest but not significant decrease in methylation with greater block age (P = 0.064; R2 = 0.017; Supplementary Fig. S4).
Age of patients.
Age-related DNA hypermethylation in normal tissues has been observed to be associated with the subsequent development of cancer (27–29). This raises a concern for cancer detection tests because the presence of hypermethylation in normal tissue can be a source of false positives. Therefore, we investigated whether CMI of our marker panel was associated with patient age. A linear regression model of CMI fit on benign/normal samples demonstrated a modest but not significant increase in methylation with greater age (P = 0.063, R2 = 0.017; Supplementary Fig. S3A) with the QM-MSP assay. A coefficient of 0.011 corresponds with a doubling of CMI from 5 to 10 units over 91 years. We also fit logistic regression models of misclassification rates as a function of age. When fit on benign/normal samples, these data show a modest increase in the log-odds of false positives with increased age (P = 0.054; Supplementary Fig. S3B). When fit on tumor samples, these data show a decrease in the risk of false negatives for older patients, although the effect is very small and not statistically significant (P = 0.478).
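The stated link between a regression coefficient of 0.011 and a doubling of CMI over 91 years can be reproduced with simple arithmetic if the model was fit on log2-transformed CMI. That transformation is our assumption; the text does not state it. The natural-log alternative is shown for comparison.

```python
import math

coefficient_per_year = 0.011  # slope reported in the text

# If the response is log2(CMI), a one-unit rise in the response is a doubling.
years_to_double_if_log2 = 1.0 / coefficient_per_year

# If the response were ln(CMI) instead, a doubling is a rise of ln(2).
years_to_double_if_ln = math.log(2.0) / coefficient_per_year

print(round(years_to_double_if_log2))  # about 91 years
print(round(years_to_double_if_ln))    # about 63 years
```

Only the log2 reading matches the 91-year figure quoted in the text, which is why that scale is assumed here.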
Methylation frequency by IHC subtype and region
IHC subtype.
Using a laboratory threshold value of 14.5 CMI units, each of the subtypes had a similar percentage of tumors that scored positive by QM-MSP (86%–100%; P = 0.50 by Fisher exact; Supplementary Table S2A) and no significant difference (P = 0.077) in CMI of the 10-gene panel (Supplementary Fig. S5A). However, between the four subtypes, the individual markers showed a varying extent of methylation (Supplementary Table S2B; Supplementary Fig. S5A). For example, ZNF671 was significantly more methylated in ER/PR−, HER2− (TNBC) compared with ER/PR+, HER2− tumors (P < 0.0001). For ER/PR−, HER2+ tumors the methylation of APC, RASGRF2, RASSF1, and TMEFF2 was higher than for other IHC subtypes (P values not adjusted; Supplementary Tables S2B and S5).
Region.
Cumulative methylation of the 10-gene panel did not differ significantly between the regions of United States, China, and South Africa (P = 0.265 for benign/normal and P = 0.474 for cancer, Kruskal–Wallis; Supplementary Fig. S5B; Supplementary Table S3). However, for individual markers, regional differences were detected for CCND2, COL6A2, HIST1H3C, and ZNF671 (Kruskal–Wallis P values = 0.010, 0.003, 0.003, 0.043, respectively; Supplementary Fig. S5B). AUC values showed modest differences between United States (AUC = 0.941; 95% CI = 0.890–0.983), China (AUC = 0.901; 95% CI = 0.813–0.970), and South Africa (AUC = 0.958; 95% CI = 0.872–1.00; Supplementary Fig. S5C).
Collectively, the 10-gene marker panel performed reliably as a “pan breast cancer” detection tool across the main IHC subtypes of breast cancer and within the three distinct regional populations evaluated.
Development of an automated cartridge for quantitative analysis of the 10-gene marker panel
QM-MSP requires substantial technical expertise and a relatively long 10-day turnaround time from DNA extraction to quantitation of cumulative methylation of a 10-gene marker panel. Therefore, we endeavored to develop the test on the automated GeneXpert assay platform that requires minimal technical training, could be completed within a few hours of sample collection, and could be widely applied in breast screening clinics in low-resource regions of the world. The distribution of the gene markers in the two cartridges was determined by systematically optimizing combinations of the six gene probes and their fluorophore tags, PCR reagents, PCR cycle numbers, and external amplicon dilutions. The optimized conditions were tested using mixtures of fully methylated CAMA-1 breast cancer cells and unmethylated human sperm DNA.
Interassay reproducibility.
To test interassay reproducibility, we analyzed Set 1 and Set 2 markers in the cartridge in six replicate assays. We examined a range of dilutions of fully methylated CAMA-1 cell DNA diluted with human sperm DNA, which is unmethylated for all 10 markers, keeping the amount of input DNA constant (1-ng total DNA). The assays were highly reproducible over a wide range of dilutions with the coefficient of variation (CV) ranging from 6% to 23% (Fig. 4A). The 10-marker set maintained log2 linearity for each marker (r2 = 0.78–0.98) in the dilution range of 3.12% to 100% of methylated DNA, indicating sensitivity of the cartridge assay to detect as low as 3% methylated target DNA.
Interoperator reproducibility.
Interoperator reproducibility of the cartridge assay was tested by two individuals blinded to the identity of the samples using two aliquots of the same FNA lysate (12 IDC and 11 benign). Cumulative methylation measurements of the 10 markers were highly reproducible between operators, with an intraclass correlation coefficient (ICC) of 0.99 (Fig. 4B). Kappa statistic for interrater reliability, using the threshold of 14.5 CMI from the test cohort, was 0.91 (95% CI: 0.75–1.08, P = 6.76e-6).
Clinical pilot testing of FNA samples.
Archival, stained FNA cells from mammography-detected lesions (24 IDC and 52 benign breast) were obtained from Hong Kong and Portugal, and analyzed in a blinded manner. Three benign samples with <50 cells and ACTB Cts >35 were excluded from data analysis. Robust marker methylation, quantitated using dilution curves ranging from 100% to 3.12% methylation for assessing percent methylation in the sample (Supplementary Fig. S6), was detectable in FNA samples of all except one cancer, as shown in the histogram in Fig. 5. Difference in cumulative methylation between cancer and benign samples was highly significant (P < 0.0001, Mann–Whitney) as shown in the box plot in Fig. 5 (inset). The GeneXpert system achieved a detection sensitivity of 96% (95% CI = 79%–100%) and a specificity of 90% (95% CI = 78%–97%) using ROC threshold of 14.5 CM (P < 0.0001; AUC = 0.960; 95% CI = 0.883–1.00; Fig. 5, inset; Supplementary Table S4). The assay, from sample preparation to results, was completed within 5 hours.
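The width of the confidence intervals quoted above reflects the small pilot size. The sketch below computes an exact (Clopper–Pearson) binomial interval using only the standard library. The choice of this interval is an assumption, since the text does not say which method was used, and the count of 23 positive calls among 24 cancers is our reading of the reported 96% sensitivity.

```python
import math
from typing import Callable, Tuple

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))

def _solve_increasing(f: Callable[[float], float], target: float) -> float:
    """Bisection for f(p) = target, where f increases with p on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2.0)
    return lower, upper

low, high = clopper_pearson(23, 24)  # assumed counts, see the note above
print(f"sensitivity = {23 / 24:.0%}, 95% CI = {low:.0%} to {high:.0%}")
```

With these assumed counts the interval rounds to roughly 79% to 100%, close to the interval reported for sensitivity, which illustrates how much uncertainty remains with only 24 cancer samples.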
Discussion
In this article, we report the systematic development of a methylated 10-gene marker panel that distinguishes between benign and cancer tissues with a high level of sensitivity and specificity. We also incorporated this panel into an automated breast cancer detection cartridge on the GeneXpert system to reduce time and increase throughput. The cartridge assay performed with good accuracy, with a sensitivity of 96% (95% CI = 79%–100%), a specificity of 90% (95% CI = 78%–97%), and an AUC of 0.960 (95% CI = 0.883–1.00) in a pilot study of archival, clinical FNA samples from Portugal and Hong Kong. The assay is simple to perform and has a time-to-result of less than 5 hours. Thus, the assay could be easily deployed to underserved regions of the world for the discrimination of breast cancer from benign lesions. Our results support the utility of assessing methylated markers in FNA for breast cancer screening in women with suspicious lesions.
Our data and outcomes attest to the robustness of the 10-gene panel of markers (Figs. 3 and 5) to distinguish between benign and cancer lesions. We did not observe significant differences in cumulative methylation of the 10-gene marker panel by geography or IHC subtype, indicating that our marker panel captures the diversity across different geographic regions and ER/PR/HER2 and triple-negative IHC subtypes of breast cancer. Some genes in the panel, such as APC, CCND2, RASGRF2, and TMEFF2 were frequently methylated in ER+/PR+ tumors, APC, AKR1B1, and RASSF1 were frequently methylated in HER2-overexpressed tumors, while ZNF671 was more frequently hypermethylated in TNBC, also consistent with reports in the literature (30–32).
Our study has several strengths as well as a few limitations. We chose to develop a DNA methylation–based assay because aberrant DNA methylation is detectable very early in nearly all breast cancers (23, 33, 34). As has been well established, gene promoter–associated CpG island methylation can silence tumor suppressor genes, much like gene deletions or mutations. Moreover, methylation alterations, in contrast to mutations, are extremely common in breast cancer (35–37). This allowed us to develop a marker set capable of detecting breast cancer in more than 90% of the samples in ethnically varied populations of the world irrespective of IHC subtype. Another important feature of the test is its high specificity in differentiating between cancer and benign tissues, displayed in large sample sets for discovery, training, and testing that included a wide variety of benign lesions encompassing cysts, usual ductal hyperplasia, fibroadenoma, and papilloma. A key limitation is that the FNA pilot was small (N = 73) and used archival samples of convenience. Further validation of the GeneXpert assay in prospective blinded clinical trials of FNA samples of benign and malignant lesions in various clinical settings is necessary to assess the full potential of this assay to reduce the time to breast cancer detection.
Our findings also demonstrated several advantages of automated breast cancer detection using the GeneXpert cartridge. The 10-gene marker panel was carefully selected in large cohorts of cancer and benign tissue using a highly sensitive laboratory assay, QM-MSP. However, QM-MSP requires technical expertise and is too time consuming for easy operation in the field. The advantages of the cartridge are as follows: (i) bisulfite conversion of the sample is automated and quick; (ii) in each of two cartridges, six fluorophores in multiplex allow simultaneous amplification of five test markers and the actin control, reducing assay time for the 10-gene marker panel from 10 days for QM-MSP to 5 hours or less; (iii) both interassay and interoperator tests validated its precision; (iv) the cartridge assay performed with the predicted high accuracy in a pilot study of 73 FNAs from mammographically suspicious breast lesions; (v) the simplified assay, data analysis, and interpretation are automated, leaving little margin for operator error; and, finally, (vi) this newly developed methylation cartridge could be easily adapted for detection of other cancers.
In conclusion, our study describes the development of a methylated DNA–based breast cancer detection assay, from marker discovery to clinical pilot testing, for use on the automated GeneXpert system. We have prototyped the first automated breast cancer detection cartridge and demonstrated its analytic and clinical validity. We have defined and validated cutoffs for the discrimination of IDC/ILC/DCIS from benign breast disease. Furthermore, we have demonstrated the feasibility of applying this assay to FNA samples collected from mammographically detected cancer and benign breast lesions. Our future steps are to prospectively validate the FNA-based molecular detection test in major hospitals in the United States, China, and African nations. This automated 5-hour assay could enable our ultimate goal: to expedite diagnosis and thereby reduce mortality from breast cancer in the underdeveloped nations of the world.
Disclosure of Potential Conflicts of Interest
A. Cimino-Mathews is a consultant/advisory board member for and reports receiving commercial research grants from Bristol-Myers Squibb. M. Bates and A.C. Wolff have ownership interests (including patents) at Cepheid. K. Visvanathan reports receiving commercial research grants from Cepheid. M.J. Fackler has ownership interests (including patents) at Cepheid, is a consultant/advisory board member for Cepheid, and reports receiving commercial research grants from Cepheid. S. Sukumar has ownership interests (including patents) at, is a consultant/advisory board member for and reports receiving commercial research grants from Cepheid. No conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: B. Downs, K. Kocmond, E.W. Lai, M.J. Fackler, S. Sukumar
Development of methodology: B. Downs, A. Cimino-Mathews, R. Sood, S. Tulac, T.N. de Guzman, E.W. Lai, M.J. Fackler, S. Sukumar
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): B. Downs, C. Mercado-Rodriguez, A. Cimino-Mathews, C. Chen, J. Yuan, E.J. Van Den Berg, F. Schmitt, G.M. Tse, S.Z. Ali, R. Sood, J. Li, A.L. Richardson, M. Mosunjac, M. Rizzo, K. Kocmond, E.W. Lai, S. Sukumar
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): B. Downs, C. Mercado-Rodriguez, A. Cimino-Mathews, C. Chen, L. Cope, S.Z. Ali, D. Meir-Levi, E.W. Lai, A.C. Wolff, M.J. Fackler, S. Sukumar
Writing, review, and/or revision of the manuscript: B. Downs, A. Cimino-Mathews, E.J. Van Den Berg, L. Cope, F. Schmitt, G.M. Tse, S.Z. Ali, J. Li, A.L. Richardson, S. Tulac, K. Kocmond, E.W. Lai, M. Bates, A.C. Wolff, S. Harvey, C.B. Umbricht, K. Visvanathan, M.J. Fackler, S. Sukumar
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): B. Downs, C. Mercado-Rodriguez, C. Chen, S. Tulac, K. Kocmond, T.N. de Guzman, E.W. Lai, M. Bates, E. Gabrielson, M.J. Fackler, S. Sukumar
Study supervision: B. Downs, M. Rizzo, E.W. Lai, B. Rhees, S. Sukumar
Acknowledgments
We would like to acknowledge engineering support from Paul Jordan for the preparation of Cepheid cartridges and Bert Vogelstein for reviewing the manuscript. We thank ArLena Smith and Jeffrey Reynolds for regulatory support. This work was supported by grants to S. Sukumar from Under Armour (80040851), AVON Foundation for Research (01-2017-007), Cepheid (Research Agreement, 90066820), and the CCSG Core Grant (P30-CA006973). B. Downs is supported by a postdoctoral fellowship from the DOD (award BC171982). F. Schmitt is partially supported by the project NORTE-01-0145-FEDER-000003, Norte Portugal Regional Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.