Abstract
Whole-blood DNA methylation markers have been suggested as potential biomarkers for early detection of breast cancer. We conducted a systematic review of the literature on whole-blood DNA methylation markers for breast cancer detection. PubMed and ISI Web of Knowledge were searched up to May 29, 2018. Overall, 33 studies evaluating 355 markers were included. The diagnostic value of most individual markers was relatively modest, with only six markers showing sensitivity >40% at specificity >75% [only two (HYAL2 and S100P) were independently validated]. Although relatively strong associations (OR ≤0.5 or OR ≥2) with breast cancer were reported for 14 markers, most of them were not independently validated. Two prospective studies performed epigenome-wide association analysis and identified 276 CpG sites related to breast cancer risk, but no overlap was observed between the CpGs reported from these two studies. Five studies combined individual markers into panels, but only two of them used a test-validation approach. In conclusion, the methylation markers detected so far are insufficient for early detection of breast cancer, but markers or marker combinations may be useful for breast cancer risk stratification. Utilizing high-throughput methods of methylation quantification, future studies should focus on further mining of informative methylation markers and on the derivation of enhanced multimarker panels with thorough external validation, ideally in prospective settings.
Introduction
Breast cancer, with 1.7 million cases and >500,000 deaths in 2012, is the most frequently diagnosed cancer and the leading cause of cancer death among females worldwide (1). Mammography is currently the most widely used method for breast cancer screening (2, 3), and evidence has shown that it can reduce breast cancer mortality in women aged 50 to 69 years (4). However, the benefit for women aged 40 to 49 is uncertain (5), and the risks of false-positive results and overdiagnosis are of concern (2, 4). Given that screening for specific cancers with certain noninvasive blood tests, such as CA125 for detecting ovarian cancer, could increase patient compliance and has already shown relatively good diagnostic performance (6, 7), a blood-based test might also offer an approach for noninvasive early detection and/or risk stratification of breast cancer (7).
Epigenetic modifications, such as DNA methylation, are heritable and modifiable markers with the ability to regulate gene expression without changing the underlying DNA sequence (8, 9). Previous investigations of DNA methylation changes in plasma and serum have revealed that DNA methylation of gene-specific loci could provide a promising alternative for non-invasive breast cancer screening (10–15). Moreover, several investigations on whole blood also indicated the potential utility of peripheral blood cell or white blood cell (WBC) DNA methylation for prediction of breast cancer (16–29). In this article, we systematically reviewed published DNA methylation studies in whole-blood-borne DNA from patients with breast cancer and healthy controls, to provide an overview on the performance of DNA methylation markers for breast cancer risk evaluation and/or early detection.
Materials and Methods
This systematic literature review was carried out according to a predefined protocol. Reporting follows the PRISMA statement (30).
Literature search
A systematic literature search was carried out to identify studies that evaluated whole-blood DNA methylation in patients with breast cancer and healthy controls. PubMed and ISI Web of Knowledge were searched for relevant articles published up to May 29, 2018. The search was done using the following keyword combinations: [(breast neoplasm or breast cancer or breast carcinoma or breast tumor or cancer of breast or cancer of the breast or mammary carcinoma or malignant neoplasm of breast or malignant tumor of breast or mammary cancer or mammary carcinoma or mammary neoplasm or neoplasm, breast or tumor, breast) and (blood or leukocyte) and (DNA or deoxyribonucleic acid or ds-DNA) and (methylation) and (risk or susceptibility or risk maker or susceptibility marker or detection or diagnosis or screen or screening or marker or biomarker)]. Duplicated publications were removed.
Eligibility criteria
Only articles written in English were included in our review. The initial screen was done based on reading the title and abstract. We excluded articles that were (i) not relevant to the topic, (ii) nonoriginal articles or reviews, (iii) not based on whole blood samples, (iv) focusing on global DNA methylation, (v) not available as full-text version. The second round of screening involved reading the full articles. Articles without any relevant information regarding the P value for the comparison of methylation levels, ORs, AUC, or sensitivity/specificity were further excluded as well. In addition, we also identified a number of papers from cross-referencing (Supplementary Fig. S1).
Data extraction
Two authors (ZG and HY) independently read and retrieved data from the studies that met the above described inclusion and exclusion criteria. Any inconsistencies were discussed and resolved among the investigators. The following variables were extracted: first author, publication year, country, study design, age of study population, DNA methylation assay, information on targeted genetic region, and essential results, including P value for differential methylation levels between cases and controls, ORs, AUC, sensitivity, and specificity. In case sensitivity and specificity were not explicitly reported, they were calculated from available tables and figures. Crude ORs were also calculated for the studies (19, 31–33) with the essential information provided in the articles.
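The two derived quantities mentioned above, crude ORs and sensitivity/specificity computed from published 2×2 counts, can be illustrated with a minimal Python sketch. The review does not state which confidence-interval method was used, so the Woolf (log-based) interval below is an assumption, and the function names and the example counts are hypothetical rather than taken from any included study.

```python
import math


def crude_odds_ratio(methylated_cases, unmethylated_cases,
                     methylated_controls, unmethylated_controls, z=1.96):
    """Crude OR with a Woolf (log-based) confidence interval from a 2x2 table.

    The Woolf interval is an assumption of this sketch; the review does not
    say how intervals for recalculated ORs were obtained.
    """
    a, b = methylated_cases, unmethylated_cases
    c, d = methylated_controls, unmethylated_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity and specificity from the four cells of a 2x2 table."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts: 30 of 100 cases and 10 of 100 controls are methylated.
example_or, example_lower, example_upper = crude_odds_ratio(30, 70, 10, 90)
example_sens, example_spec = sensitivity_specificity(30, 70, 90, 10)
```

With these hypothetical counts the sensitivity is 0.30 and the specificity is 0.90, which is the kind of low-sensitivity, high-specificity profile reported for many dichotomized markers in this review.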
Quality assessment
The quality of each included article was assessed according to the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2), and the QUADAS-2 tables were completed using Review Manager (version 5.3). The risk of bias and concerns regarding applicability were evaluated in four domains: patient selection, index test, reference standard, and flow and timing. The risk of bias as well as concerns regarding applicability were classified in each domain as low, high, or unclear (Supplementary Figs. S2 and S3; ref. 34).
Results
Literature search result
The initial search yielded 784 articles using the above-mentioned search terms (Supplementary Fig. S1). After the removal of 230 duplicate articles, titles and abstracts of 554 articles were carefully reviewed, resulting in 34 articles that underwent full-text review. Eight articles were excluded because relevant information on P value, ORs, AUC, sensitivity, or specificity could not be extracted. In addition, seven articles were identified through cross-referencing. Finally, 33 studies met the inclusion criteria and were included in this review. Information on ORs could be extracted or calculated in 25 articles. In the other eight studies, only the results of the statistical tests for the differences of DNA methylation levels between breast cancer cases and controls were reported. Sensitivity and specificity could be extracted in 18 studies and AUCs were only reported in five studies.
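The selection counts reported above can be checked with simple arithmetic. The sketch below is only a consistency check of the numbers in this paragraph; the function name and dictionary keys are hypothetical.

```python
def selection_flow(identified, duplicates, full_text_reviewed,
                   excluded_after_full_text, added_by_cross_reference):
    """Recompute the screening and inclusion counts of a study-selection flow."""
    screened = identified - duplicates
    included = (full_text_reviewed - excluded_after_full_text
                + added_by_cross_reference)
    return {"screened": screened, "included": included}


# Counts as reported in the text: 784 records, 230 duplicates, 34 full texts,
# 8 exclusions after full-text review, 7 articles from cross-referencing.
flow = selection_flow(784, 230, 34, 8, 7)
```

The recomputed values agree with the 554 screened titles and abstracts and the 33 included studies stated in the text.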
Study characteristics
The study characteristics are provided in Supplementary Table S1. Of the 33 identified studies, 16 were carried out in Europe [five studies from Germany (20, 22, 23, 26, 35) and five studies from the United Kingdom (16, 18, 28, 36, 37)], nine studies were from Asian countries (17, 19, 21, 33, 38–42) and three studies were from the United States (29, 43, 44). The majority of included studies recruited breast cancer cases from clinical settings and one study (45) was restricted to cases with family history of breast cancer. Five studies recruited participants from the general population (18, 26–28, 46). In the study by Xu and colleagues (29), eligible participants were breast cancer-free at recruitment but had a biological sister diagnosed with breast cancer. In six nested case–control studies (18, 27–29, 37, 46), cases were identified during follow-up of prospective cohorts, and controls were selected from participants remaining free from breast cancer. In these studies, methylation was quantified in baseline blood samples taken years before breast cancer occurrence. In the other 27 case–control studies, blood samples from breast cancer cases were collected at the time of diagnosis or shortly after diagnosis but before any treatment. An exception was the study by Gupta and colleagues (47), where 23 of 66 cases received chemotherapy or radiotherapy prior to blood collection. The age range (in four studies), average age (in 13 studies), or median age (in five studies) was reported in 22 studies. Age distributions were mostly comparable between cases and controls due to matching, but major age differences with substantially lower age of controls were reported in some studies (21, 33, 48). Various techniques (10 types) were used for measuring methylation, with methylation specific polymerase chain reaction (MSP) being the most commonly used. 
Array-based assays, such as the Infinium HumanMethylation 27K or 450K Beadchip were applied in eight studies (20, 22, 23, 27–29, 35, 46). The majority of the identified studies did not use a validation set to confirm their results, and only eight studies (18, 20, 22, 23, 27–29, 35) performed independent validations.
Based on QUADAS-2, the overall risk-of-bias and applicability assessments for all included studies are reported in Supplementary Figs. S2 and S3. Twenty-two of the 33 studies were judged as being at low risk, eight at unclear risk, and three (32, 33, 49) at high risk of patient selection bias. Fourteen of the 33 studies were judged as having unclear risk of bias for the conduct or interpretation of the index test because they used nonquantitative methylation methods. Regarding the reference standard for cancer diagnosis, all studies were judged as being at low risk. Finally, 20 of the 33 studies were considered as having unclear risk of bias concerning patient flow.
Overview of whole-blood based DNA methylation markers
Of the 33 included studies, 25 applied a gene-specific approach and reported 70 methylation markers, and eight studies (20, 22, 23, 27–29, 35, 46) identified methylation markers using array-based technologies (Illumina 27K or 450K BeadChip) and reported 285 differentially methylated CpG sites. Among these 285 CpG sites, nine (C7orf50-cg03916490, GREB1-cg18584561, HYAL2-cg27091787, MGRN1-cg00736299, RPTOR-cg06418238, RAPSN-cg27466532, PNKD-cg01741999, TMC3-cg27639199, and S100P-cg22266967) were further confirmed in independent samples from the same studies (20, 22, 23, 27). We summarized these nine CpG sites together with the 70 gene-specific markers in Table 1 and present the other 276 CpG sites in Supplementary Table S2. As shown in Table 1, 20 of the 79 markers were reported ≥2 times and the remaining markers were reported only once. BRCA1 methylation was assessed most frequently (11 times), followed by CDH1 (5 times) and APC (4 times). The frequency of statistically significant findings for individual markers reported ≥2 times ranged from 0% to 100%. For example, IGF2 and RASSF1A were reported twice (36, 37) and three times (24, 25, 48), respectively, with nonsignificant results in each study; CXCL12 (25, 48) and SFRP1 (22, 26) were both reported twice with significant results in all studies; and the other 16 markers were reported with significant results in some of the studies that investigated them. Table 1 also shows the direction of methylation changes (hypo- or hypermethylation) for breast cancer cases compared with controls. Among the genes evaluated ≥2 times, the majority were hypermethylated among cases, and only PTGS2 (22, 26), RARβ (46), and SFRP1 (22, 26) were hypomethylated. Among the 59 gene-specific methylation markers reported only once, 13 showed statistically significant hypermethylation, 13 showed hypomethylation, and 33 showed no association with breast cancer.
On the other hand, 214 of the 276 EWAS-derived CpG sites were hypomethylated, and 62 were hypermethylated (28, 29). Supplementary Tables S1 and S2 present the targeted genetic region of all markers. In contrast to the eight array-based studies, which could detect CpG sites across the genome, the vast majority of gene-specific studies assessed methylation levels in the promoter region of the targeted genes, and only a few studies assessed methylation levels of the gene body (16, 18) or of imprinted differentially methylated regions (iDMR; ref. 36).
Associations of methylation markers with breast cancer
In assessing the associations between DNA methylation markers and breast cancer, 18 studies reported statistically significant ORs for 27 target loci (Tables 2–4). Studies that only provided P values (difference in methylation levels between cases and controls) or nonsignificant ORs (for breast cancer risk) are shown in Supplementary Tables S2 and S3.
Dichotomized methylation of specific genes showing significant associations with breast cancer
Gene | First author, year, ref. number | Country | Number cases/controls | Age (year) cases/controls | DNA methylation assay | OR (95% CI) | P valueg |
---|---|---|---|---|---|---|---|
BRCA1 (promoter) | Iwamoto, 2011 (17) | Japan | 200/200 | 50/50 | MSP | 1.73 (1.01–2.96)a | 0.045 |
BRCA1 (promoter) | Wong, 2011 (45) | Australia | 255/169 | <40/<40 | MS-HRM | 3.50 (1.4–10.5)b | 0.004 |
BRCA1 (promoter) | Gupta, 2014 (47) | Poland | 66/36 | 63/64 | MS-HRM | 5.00 (1.10–23.30)b | 0.03 |
BRCA2 (promoter) | Wojdacz, 2011 (49) | Denmark | 180/108 | 56.2/56.6 | MS-HRM | 0.28 (0.10–0.90)b | 0.04 |
CDH1 (promoter) | Cho, 2015 (43) | USA | 1021/1036 | – | Methylight | 0.65 (0.54–0.79)c | – |
CLOCK (promoter) | Hoffman, 2010 (44) | USA | 75/80 | 30–80/– | MSP | 0.23 (0.08–0.69)d | 0.008 |
DBC2 (promoter) | Mirzaei, 2012 (31) | Iran | 50/30 | 55.7/38 | MSP | 4.97 (1.52–16.29)e | – |
P14ARF (promoter) | Askari, 2013 (42) | India | 150/150 | – | MSP | 1.99 (1.68–2.36)b | <0.001 |
P16INK4α (promoter) | Askari, 2013 (42) | India | 150/150 | – | MSP | 2.14 (1.82–2.50)b | <0.001 |
P21/CIP1 (promoter) | Askari, 2013 (41) | India | 150/150 | – | MSP | 2.31 (1.95–2.70)b | <0.001 |
PEG3 (imprinted region) | Harrison, 2015 (36) | UK | 189/363 | 56/56 | Pyrosequencing | 1.08 (1.02–1.14)f | 0.008 |
PTEN (promoter) | Yari, 2016 (21) | Iran | 103/50 | 49.8/36.6 | MSP | 3.10 (1.63–5.95)b | <0.001 |
RARβ (promoter) | Cho, 2015 (43) | USA | 1021/1036 | – | Methylight | 0.67 (0.55–0.81)c | – |
Abbreviation: ref., reference.
aModel adjusted for age, family history, age at menarche, parity, menopausal status, and body mass index.
bNo information for model covariates adjustment.
cModel adjusted for age, family history, body mass index, and physical activity.
dModel adjusted for age, race, family history of breast cancer, study site, menopausal status, parity, and age at menarche.
eCrude odds ratio.
fModel adjusted for menopausal status, age, and weight.
gStatistical significance for OR.
Quantitative methylation of specific genes with significant associations to risk of breast cancer
Gene | First author, year, ref. number | Country | Number cases/controls | Age (year) cases/controls | DNA methylation assay | ORc (95% CI) | ORd (95% CI) | ORe (95% CI) | ORf (95% CI) |
---|---|---|---|---|---|---|---|---|---|
ATMmvp2a (gene-body) | Flanagan, 2009 (16) | UK | 190/190 | 62.8/62.8 | Pyrosequencing | 0.52 (0.28–1.00) | – | – | – |
ATMmvp2a | Brennan, 2012 (18) | UK | 640/741 | 52/55 | Pyrosequencing | 0.53 (0.38–0.74) | – | – | – |
ATMmvp2b (gene-body) | Flanagan, 2009 (16) | UK | 190/190 | 62.8/62.8 | Pyrosequencing | – | 0.32 (0.17–0.56) | – | – |
RAPSN (TSS1500) | Tang, 2016 (23) | Germany | 568/541 | 50.2/48.8 | MassARRAYg | – | 2.04 (1.45–2.88) | – | – |
RPTOR (gene-body) | Tang, 2016 (23) | Germany | 568/541 | 50.2/48.8 | MassARRAY | – | 2.81 (1.97–4.01) | – | – |
MGRN1 (gene-body) | Tang, 2016 (23) | Germany | 565/540 | 50.2/48.8 | MassARRAY | – | 5.14 (3.48–7.60) | – | – |
NUP155 (I) | Widschwendter, 2008 (26) | Germany | 307/653 | 50–74/50–74 | Methylight | – | – | 1.03 (1.00–1.07) | – |
NEUROD1 (promoter) | Widschwendter, 2008 (26) | Germany | 299/642 | 50–74/50–74 | Methylight | – | – | 1.04 (1.01–1.07) | – |
SFRP1 (promoter) | Widschwendter, 2008 (26) | Germany | 321/676 | 50–74/50–74 | Methylight | – | – | 1.04 (1.01–1.07) | – |
TITF1 (promoter) | Widschwendter, 2008 (26) | Germany | 321/676 | 50–74/50–74 | Methylight | – | – | 1.04 (1.00–1.08) | – |
ZNF217 (II) | Widschwendter, 2008 (26) | Germany | 303/638 | 50–74/50–74 | Methylight | – | – | 1.04 (1.01–1.07) | – |
S100P (1st exon) | Yang, 2017 (20) | Germany | 206/235 | 43/43 | MassARRAYh | – | – | 2.37 (1.79–3.13) | – |
S100Pa (1st exon) | Yang, 2017 (20) | Germany | 189/189 | 60/63 | MassARRAY | – | – | 2.37 (1.63–3.44) | – |
S100Pa (1st exon) | Yang, 2017 (20) | Germany | 156/151 | 46/45 | MassARRAY | – | – | 2.84 (1.79–4.50) | – |
HYAL2b (TSS1500) | Yang, 2014 (22) | Germany | 338/507 | 46.1/44.7 | MassARRAYe | – | – | 4.18 (3.36–5.19) | – |
HYAL2b (TSS1500) | Yang, 2014 (22) | Germany | 189/189 | 59.6/61.2 | MassARRAY | – | – | 8.14 (5.37–12.34) | – |
C7orf50 (gene-body) | Joo, 2018 (27) | Australia | 433/433 | – | HM 450K | – | – | – | 0.83 (0.72–0.96) |
GREB1 (TSS1500) | Joo, 2018 (27) | Australia | 433/433 | – | HM 450K | – | – | – | 1.18 (1.03–1.36) |
TMC3 (TSS200) | Joo, 2018 (27) | Australia | 433/433 | – | HM 450K | – | – | – | 1.19 (1.03–1.36) |
PNKD (gene-body) | Joo, 2018 (27) | Australia | 433/433 | – | HM 450K | – | – | – | 1.26 (1.03–1.54) |
Abbreviations: HM 450K, Infinium HumanMethylation450K BeadChip; Mvp, methylation variable positions; ref., reference; TSS, transcriptional start site.
aRef. 20 reported AUC values (0.81; 95% CI, 0.77–0.86), (0.68; 95% CI, 0.62–0.73), and (0.70; 95% CI, 0.64–0.75) in three independent validations.
bRef. 22 reported AUC values (0.83; 95% CI, 0.80–0.86) and (0.89; 95% CI, 0.85–0.92) in two independent validations.
cModel calculated for lowest quintile versus highest quintile.
dModel calculated for lowest quartile versus highest quartile.
eModel calculated for per 10% lower methylation.
fModel calculated for methylation M values (per 1 SD). Ref. 18 adjusted by 5-year age categories and study cohorts, ref. 26 adjusted for age and family history of breast cancer, and refs. 20, 22 and 23 adjusted for age and different batches for the measurements. Ref. 27 adjusted for body mass index, tobacco smoking, alcohol drinking, time between blood collection and cancer diagnosis, and sample type (dried blood spots, peripheral blood mononuclear, and buffy coats).
gDiscovery round used Human 450K methylation array.
hDiscovery round used Human 27K methylation array.
Diagnostic performance of methylation panels
Gene panel | First author, year, ref. number | Country | Number cases/controls | Age (year) cases/controls | DNA methylation assay | AUC | ORa (95% CI) | P value |
---|---|---|---|---|---|---|---|---|
MGRN1+RAPSN+RPTOR | Tang, 2016 (23) | Germany | 109/102 | 46.6/42.6 | MassARRAY | 0.79 (0.73–0.85)b | – | – |
MGRN1+RAPSN+RPTOR | Tang, 2016 (23) | Germany | 189/189 | 59.6/59.1 | MassARRAY | 0.60 (0.54–0.66)c | – | – |
MGRN1+RAPSN+RPTOR | Tang, 2016 (23) | Germany | 270/250 | 44.3/44.8 | MassARRAY | 0.62 (0.57–0.67)d | – | – |
ANKRA2 + CDCA4 + ERCC1 + KM-NH-1 + NEK6 | Xu, 2013 (29) | USA | 25/56 | – | HM 27K | 0.63 | – | – |
P14/ARF+P16/INK4α | Askari, 2013 (53) | India | 150/150 | – | MSP | – | 12.31 (7.31–20.72) | 0.004e |
BRCA1+HIN1+RASSF1A+CDH1+RARβ+APC+TWIST1+CyclinD2 | Cho, 2010 (24) | Turkey | 40/40 | 50.8/48.3 | Methylight | – | 2.00 | >0.05e |
BRIP1(I)+ESR1+SIRT3+NUP155(I)+PITX2(I)+PITX2(II)+DCC+ZNF217(II)+FLJ39739+PGR+TIMP3+CDH13+HSD17B4+PTGS2+SLC6A20+NEUROG1+HOXA1+TITF1+GDNF+NEUROD1+SFRP1+MYOD1+SYK+CYP1B1+SEZ6L | Widschwendter, 2008 (26) | Germany | 353/730 | 50–74/50–74 | Methylight | 0.628 | – | 0.048f |
Abbreviation: ref., reference.
aAssociation between dichotomized methylation of gene panel and breast cancer.
bAUC for validation I.
cAUC for validation II.
dAUC for validation III.
eStatistical significance for OR.
fRepresents statistical significance for AUC value.
Eleven of the 18 studies estimated the associations with breast cancer for dichotomized methylation levels of 11 genes, measured with four types of methylation techniques: MSP, methylation-sensitive high-resolution melting (MS-HRM), pyrosequencing, and Methylight (MSP was the most frequently used method; Table 2). Significant findings reported from these studies are presented in Table 2. Seven markers (BRCA1, DBC2, P14ARF, P16INK4α, P21/CIP1, PEG3, PTEN) showed positive associations with breast cancer, with ORs ranging from 1.08 to 5.00. BRCA2 (49), CDH1 (43), CLOCK (44), and RARβ (43) showed negative associations with breast cancer, with ORs ranging from 0.23 to 0.67. Only 4 of the 11 studies provided information on the covariates adjusted for in the regression models (17, 36, 43, 44).
Seven of the 18 studies estimated the associations with breast cancer for quantitative methylation levels of 15 genes, measured with four types of methylation techniques: Illumina 450K, MassARRAY, pyrosequencing, and Methylight (16, 18, 20, 22, 23, 26, 27). ORs were estimated in four forms: for the lowest versus the highest quintile (16, 18), for the lowest versus the highest quartile (16, 23), per 10% lower methylation (16, 20, 22, 23, 26), and per 1 SD of the methylation M value (ref. 27). Significant results reported from these studies are presented in Table 3. Two studies, by Flanagan and colleagues (16) and Brennan and colleagues (18), both focused on the gene body of ATM and found consistently strong associations of ATM methylation with breast cancer. In four other studies, by Joo and colleagues (27), Tang and colleagues (23), and Yang and colleagues (20, 22), the researchers used test-validation approaches, and their findings were all confirmed in their own validation steps. Of note, hypomethylation of three CpG sites in MGRN1, RAPSN, and RPTOR showed strong associations with breast cancer risk in three validation panels in Tang and colleagues' study, but these findings could not be replicated in a recent prospective study by Dugue and colleagues (46). In addition, Yang and colleagues (20, 22) found and validated strong associations of demethylation in HYAL2 and S100P with breast cancer. Five of the seven studies (18, 20, 22, 23, 26) adjusted for a limited set of covariates, including age, family history, study cohort, or measurement batches, and one study (16) did not provide information regarding covariate adjustment.
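Because the ORs above are expressed on different scales, they cannot be compared directly across studies. The following Python sketch illustrates two of the underlying conversions under stated assumptions: the M value is computed with the commonly used base-2 logit of the beta value (the review does not describe how ref. 27 derived its M values), and the rescaling of a per-unit OR assumes a log-linear logistic model. The function names are hypothetical.

```python
import math


def beta_to_m_value(beta):
    """Convert a methylation beta value (0 < beta < 1) to an M value.

    Uses the common definition M = log2(beta / (1 - beta)); this is an
    assumption of the sketch, not a detail reported in the review.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def rescale_odds_ratio(or_per_unit, n_units):
    """OR for a change of n_units, given the OR for a one-unit change.

    Valid only if the log-odds are linear in the exposure, as assumed in a
    standard logistic regression with a continuous covariate.
    """
    if or_per_unit <= 0:
        raise ValueError("an odds ratio must be positive")
    return or_per_unit ** n_units


# A beta value of 0.5 (half of the molecules methylated) maps to M = 0.
m_half = beta_to_m_value(0.5)

# Under log-linearity, an OR of 1.04 per 10% lower methylation corresponds
# to 1.04 ** 2 for 20% lower methylation.
or_per_20_percent = rescale_odds_ratio(1.04, 2)
```

Quantile contrasts (lowest versus highest quintile or quartile) cannot be rescaled in this way without the underlying distribution of methylation levels, which is one reason the columns of Table 3 are kept separate.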
Diagnostic performance
An overview of the diagnostic performance of whole-blood DNA methylation markers is shown in Fig. 1 and Supplementary Table S4. Most markers demonstrated very limited discriminative capacity. In seven studies (17, 19, 32, 33, 43, 45, 47) assessing the performance of BRCA1, sensitivity ranged from 10% to 43% at specificities of approximately 85% to 95%. HYAL2 from Yang and colleagues' study (22) showed the best discriminative performance: at a specificity of 90%, the sensitivity was 58.50% in Validation I and 63.88% in Validation II (AUCs were 0.83 and 0.89, respectively). Another promising candidate reported by Yang and colleagues is S100P (20), which showed a sensitivity of 71.60% at a specificity of 76.60% and an AUC of 0.81 (95% CI, 0.77–0.86) in one of the validation samples. Unfortunately, it showed less promising results in two other validation sets [AUC 0.68 (95% CI, 0.62–0.73) and 0.70 (95% CI, 0.64–0.75)]. Three other markers, DBC2 (31), PTEN (21), and P16INK4α (42), showed relatively modest discriminative performance, with sensitivity of >40% at specificity of ≥75%.
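To make the two performance measures used above concrete, the following Python sketch computes the sensitivity at a fixed specificity and a rank-based AUC from per-sample marker scores. It is an illustration under stated assumptions: higher scores are taken to indicate cancer (for hypomethylated markers such as HYAL2 or S100P the methylation level would have to be negated first), the function names are hypothetical, and the scores are invented for the example rather than drawn from any included study.

```python
def sensitivity_at_specificity(case_scores, control_scores, target_specificity=0.90):
    """Return (threshold, sensitivity, specificity) for the lowest threshold
    whose specificity reaches the target. A sample is called positive when
    its score is strictly greater than the threshold."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    thresholds = sorted(set(case_scores) | set(control_scores))
    for threshold in thresholds:  # ascending: specificity can only increase
        specificity = sum(s <= threshold for s in control_scores) / len(control_scores)
        if specificity >= target_specificity:
            sensitivity = sum(s > threshold for s in case_scores) / len(case_scores)
            return threshold, sensitivity, specificity
    raise RuntimeError("unreachable: the largest threshold yields specificity 1.0")


def auc_by_ranks(case_scores, control_scores):
    """AUC as the probability that a random case scores higher than a random
    control, counting ties as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented scores on an arbitrary integer scale (not data from any study).
controls = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
cases = [50, 70, 95, 110, 120]

threshold, sens, spec = sensitivity_at_specificity(cases, controls, 0.90)
auc = auc_by_ranks(cases, controls)
```

Fixing the specificity first and then reading off the sensitivity mirrors how the HYAL2 results are reported above, and it makes markers from different studies easier to compare than sensitivity and specificity pairs taken at study-specific cut-offs.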
Figure 1. Graphical representation of sensitivity versus specificity of DNA methylation markers. Sensitivity is plotted on the y-axis and the false-positive rate (100 − specificity) on the x-axis. The sensitivity of HYAL2 was estimated from the ROC curve reported in Yang et al. (22) at a specificity of 90%. Markers with a reported sensitivity >40% along with a specificity >75% are depicted by individual symbols: S100P Validation I; HYAL2 Validation I and Validation II (at specificity of 90%); P16INK4α; DBC2; BRCA1 (multiple studies, each study in a different color); and PTEN. All other markers, with relatively poor diagnostic performance, are assigned the same symbol.
DNA methylation panels
Combinations of individual methylation markers were evaluated in only five studies (Table 4). The marker panels seem to enhance diagnostic efficacy. For example, in Askari and colleagues' study (42), ORs (95% CI) for individual associations of dichotomous classification of P14/ARF and P16/INK4α with breast cancer were 1.99 (1.68–2.36) and 2.14 (1.82–2.50), respectively, and the OR for presence of both markers versus absence of both markers was 12.31 (7.31–20.72). However, improved performance of combining multiple markers was not observed in Tang and colleagues' study (23): AUCs for a combination of MGRN1, RAPSN, and RPTOR were 0.79 (95% CI, 0.73–0.85), 0.60 (95% CI, 0.54–0.66), and 0.62 (95% CI, 0.57–0.67) in three sets of validation.
Discussion
In this systematic literature review, we identified 33 studies assessing whole-blood based DNA methylation markers for its potential diagnostic value for breast cancer. The identified studies investigated both gene-specific and methylation array-based markers with 10 types of methylation assays. Only a few markers showed significant associations with breast cancer in either univariate or multivariate regression analysis, and the diagnostic efficacy of these markers was relatively modest, with only six markers (BRAC1, DBC2, HYAL2- cg27091787, P16NK4α, PTEN, and S100P- cg22266967) showing sensitivity >40% at specificity >75% [only two (HYAL2 and S100P) were independently validated]. Of note, diagnostic performance for all of the markers, especially for those derived from epigenome-wide association studies (EWAS) are yet to be validated in larger independent samples and therefore need to be interpreted with caution.
Compared with another type of epigenetic marker, miRNAs detected in serum or plasma and evaluated for potential use in breast cancer diagnosis (50), the DNA methylation markers reviewed in this study appear to be less promising. When the specificity threshold was set at ≥90%, a level commonly required for screening, a sensitivity >50% was reported only for HYAL2, and in only one study (22). The so far relatively modest diagnostic performance of the identified markers may primarily result from the experimental methods used to determine methylation levels in previous studies. Although DNA methylation analysis methods have undergone rapid and revolutionary development (51), the vast majority of included studies utilized qualitative or semiquantitative assays, or quantitative methods that can analyze DNA methylation only within particular regions of specific genes (21, 38, 41–45, 47, 49, 52, 53). For example, these methods cover very few CpGs, and MSP can assess only one or two CpGs at a time (54, 55). Furthermore, simple dichotomization of methylation levels, either by the measurement itself (e.g., MSP) or by defining an arbitrary cut-off in the statistical analysis (e.g., MS-HRM), often led to substantial loss of information. Even though microarray- or sequencing-based high-throughput technologies have been available for approximately 10 years (56), only six included studies derived candidate markers using epigenome-wide analysis (20, 22, 23, 27, 35, 46). Reasons for the limited use of epigenome-wide analysis may include its high per-sample cost. In addition, the small sample sizes of the few such studies may have limited their power, because epigenome-wide analyses require rigorous correction for extensive multiple testing. Of note, the most promising marker identified so far, HYAL2, was derived from the Illumina 27K analysis rather than the Illumina 450K analysis.
However, ever-advancing technologies with much more extensive coverage of CpGs across the genome, such as the Illumina EPIC assay (57), offer promising new opportunities for mining and deriving novel methylation markers. Large-scale consortia that combine data from multiple studies, in a manner similar to that used most successfully for genome-wide association studies, should take advantage of these new analysis methods to identify informative methylation markers for breast cancer diagnosis.
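The multiple-testing burden of epigenome-wide analyses, which grows with array coverage, can be made concrete with a short Python sketch. It computes Bonferroni-corrected significance thresholds for approximate array sizes and applies the Benjamini-Hochberg false discovery rate procedure to a small list of P values. The probe counts are rounded figures implied by the array names (exact counts differ), the P values are hypothetical, and neither correction is claimed to be the one used in the included studies.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of tests rejected at false discovery rate q."""
    m = len(p_values)
    # Indices of the P values in ascending order of P.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        # Keep the largest rank whose P value is below its step-up threshold.
        if p_values[index] <= rank * q / m:
            largest_rank = rank
    return sorted(order[:largest_rank])


# Rounded, approximate numbers of CpG probes per array (assumption).
approx_array_sizes = {"27K": 27_000, "450K": 450_000, "EPIC": 850_000}
for name, n_probes in approx_array_sizes.items():
    print(f"{name}: Bonferroni threshold = {bonferroni_threshold(n_probes):.2e}")

# Hypothetical P values for ten CpGs.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("Rejected (BH, q=0.05):", benjamini_hochberg(example_p))
```

The thresholds shrink in proportion to the number of CpGs tested, which is why arrays with wider coverage require correspondingly larger study samples to retain power.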
In addition to variability in methylation assays, heterogeneity of the included studies in study populations (including age, ethnicity, sample size, and subtypes of breast cancer cases) and in the methodology of estimating and reporting associations with breast cancer hindered direct comparison of the investigated methylation markers. This heterogeneity also partly explains the inconsistent or conflicting findings of studies evaluating the same methylation markers. For example, seven studies from six Western countries (22, 32, 43, 45, 47, 49, 52) and four studies from three Eastern countries (17, 19, 24, 33) evaluated BRCA1 methylation. The number of breast cancer cases in those studies ranged from 7 to 1021, and the age of the cases ranged from 26 to 89 years. Five types of methylation assays (including non-quantitative and quantitative methods) were applied in the 11 studies. Of the five of the 11 studies that found significant associations, two (22, 32) reported only P < 0.05 for the difference in methylation levels between cases and controls, the other three (17, 45, 47) reported ORs (ranging from 1.7 to 5.0) for dichotomized measures, and only one (17) reported information on confounder adjustment in risk estimation. In addition, breast cancer is a genetically and histologically heterogeneous disease (58), and subtype-specific methylation profiles in breast cancer tissue have also been identified (59). Although most included studies enrolled all types of breast cancer cases, several studies focused on specific subgroups of breast cancer, such as in situ or invasive breast cancer only (43), triple-negative and/or medullary breast cancer (47), or sporadic breast cancer only (52). The heterogeneous nature of the included studies precluded us from conducting a meta-analysis to provide summary estimates of diagnostic performance for the investigated markers.
Future studies should pay particular attention to selecting cases (with respect to study size and patient and tumor characteristics), applying careful matching and/or adjustment for potential confounders in case–control studies or case–control studies nested within prospective cohorts, utilizing accurate quantitative methylation assays, and analyzing data with sophisticated statistical methods, in order to identify truly relevant methylation markers for breast cancer.
Rigorous independent validation is crucial to derive reliable genetic and epigenetic markers or marker panels, particularly for candidates discovered in genome-wide approaches (57, 60). Validation in independent samples was performed in only eight of the studies. In the studies conducted by Yang and colleagues (20, 22) and Tang and colleagues (23), validation was even performed in two to three independent samples. Of note, in Yang and colleagues' study (20), relatively good performance of S100P was observed in only one of the three validation samples. This highlights the necessity of external validation in good-quality studies, preferably in studies with blood samples prospectively collected prior to disease diagnosis. Notably, most included studies evaluated methylation markers in patients who had already been diagnosed with breast cancer, such that the altered methylation levels might result from the disease process. The timing of blood sampling may be one of the reasons why three CpG sites in MGRN1, RAPSN, and RPTOR, which showed strong associations with breast cancer in Tang and colleagues' study (23), showed null associations in Dugue and colleagues' nested case–control study, in which DNA methylation was measured in blood samples collected before the disease manifested (46). Although methylation changes that result from breast cancer or arise after its manifestation might still be of some value for early detection, validation in prospective settings is indispensable for markers intended for risk stratification regarding breast cancer development.
In comparison to serum or plasma analyses, which rely on limited amounts of cell-free DNA with a short half-life and high sensitivity to processing (61), DNA methylation analyses based on whole-blood samples have several advantages, including easier accessibility and processing, as well as sufficiency and stability of blood DNA (62). Because whole-blood DNA represents a mixture of leukocyte subtypes and methylation patterns may vary between leukocyte subtypes, a major concern has been raised that the derived methylation markers may at least partly reflect shifts in leukocyte subtype distribution (63). Nevertheless, this potential confounding effect can be controlled by adjustment for WBC counts or estimated WBC distribution in data analysis (20, 22). A well-established method for estimating leukocyte subtype proportions has been widely used in this field (64). Furthermore, even if the blood DNA methylation pattern is partly driven by leukocyte distribution, this may not invalidate the use of methylation markers in breast cancer diagnosis.
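The concern about leukocyte composition can be illustrated with a simplified mixture calculation. In the following Python sketch, the methylation level measured in whole blood at one CpG is modeled as the proportion-weighted mean of cell-type-specific methylation levels, so a shift in cell proportions alone changes the measured level. The cell types, proportions, and methylation values are hypothetical, and the linear mixture is a simplifying assumption rather than the estimation method cited above.

```python
def whole_blood_methylation(cell_fractions, cell_type_beta):
    """Whole-blood methylation at one CpG as a fraction-weighted mean.

    cell_fractions: mapping of cell type to its proportion (must sum to 1).
    cell_type_beta: mapping of cell type to its methylation level (0 to 1).
    """
    total = sum(cell_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("cell fractions must sum to 1")
    return sum(
        fraction * cell_type_beta[cell_type]
        for cell_type, fraction in cell_fractions.items()
    )


# Hypothetical cell-type-specific methylation levels, identical in both groups.
beta_by_cell_type = {"neutrophils": 0.80, "lymphocytes": 0.40, "monocytes": 0.60}

# Hypothetical leukocyte distributions in controls and in cases.
control_fractions = {"neutrophils": 0.60, "lymphocytes": 0.30, "monocytes": 0.10}
case_fractions = {"neutrophils": 0.70, "lymphocytes": 0.20, "monocytes": 0.10}

control_level = whole_blood_methylation(control_fractions, beta_by_cell_type)
case_level = whole_blood_methylation(case_fractions, beta_by_cell_type)
print(f"controls: {control_level:.2f}, cases: {case_level:.2f}")
```

In this example the two groups differ in measured whole-blood methylation although no cell type differs in its own methylation level, which is the situation that adjustment for measured or estimated WBC distribution is meant to address.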
Conclusions
Over the last decade, many efforts have been made to investigate the association between methylation signatures and breast cancer. Although the blood DNA methylation markers established so far are insufficient for breast cancer early detection, some of them may hold potential for breast cancer risk stratification. Nevertheless, the field is still nascent. Apart from concerns about publication bias, the vast majority of previous studies were retrospective in nature, with methylation markers measured after the onset of the disease, and their findings may have been influenced by the disease process. Thus, previous reports of aberrant DNA methylation need to be validated in prospective, high-quality studies and tested in large screening populations using precise, quantitative methods. Advanced new technologies for epigenome-wide methylation analyses, when applied in sufficiently large study samples, should facilitate the identification of truly relevant methylation markers or panels for breast cancer risk prediction. Furthermore, because the combination of multiple SNPs has shown its capacity to discriminate between women at different risk of breast cancer, combining methylation-based risk scores with polygenic risk scores may provide even more accurate risk prediction, facilitate the identification of women at increased risk, and thereby contribute to enhanced personalized, risk-adapted screening strategies in the future.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: K. Cuk, Y. Zhang, H. Brenner
Development of methodology: Z. Guan, Y. Zhang
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Z. Guan, H. Yu
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Z. Guan, K. Cuk, H. Brenner
Writing, review, and/or revision of the manuscript: Z. Guan, K. Cuk, Y. Zhang, H. Brenner
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Z. Guan
Study supervision: Y. Zhang, H. Brenner
Acknowledgments
This work was supported by China Scholarship Council (Grant No.: 201606260041; to Z. Guan).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.