Ovarian cancer is a leading cause of death for women worldwide, in part due to ineffective screening methods. In this study, we used whole-genome cell-free DNA (cfDNA) fragmentome and protein biomarker [cancer antigen 125 (CA-125) and human epididymis protein 4 (HE4)] analyses to evaluate 591 women with ovarian cancer, with benign adnexal masses, or without ovarian lesions. Using a machine learning model with the combined features, we detected ovarian cancer with specificity >99% and sensitivities of 72%, 69%, 87%, and 100% for stages I to IV, respectively. At the same specificity, CA-125 alone detected 34%, 62%, 63%, and 100%, and HE4 alone detected 28%, 27%, 67%, and 100% of ovarian cancers for stages I to IV, respectively. Our approach differentiated benign masses from ovarian cancers with high accuracy (AUC = 0.88, 95% confidence interval, 0.83–0.92). These results were validated in an independent population. These findings show that integrated cfDNA fragmentome and protein analyses detect ovarian cancers with high performance, enabling a new accessible approach for noninvasive ovarian cancer screening and diagnostic evaluation.

Significance:

There is an unmet need for effective ovarian cancer screening and diagnostic approaches that enable earlier-stage cancer detection and increased overall survival. We have developed a high-performing accessible approach that evaluates cfDNA fragmentomes and protein biomarkers to detect ovarian cancer.

Ovarian cancer is a leading cause of death in women worldwide, with more than 300,000 new cases and nearly 200,000 deaths globally each year (1). In the United States during 2024, approximately 19,600 new cases will be diagnosed and 12,700 women will die from ovarian cancer (2). The most common form of ovarian cancer is epithelial ovarian cancer, which comprises four major subtypes: serous, clear cell, mucinous, and endometrioid carcinomas. According to the Surveillance, Epidemiology, and End Results database, for individuals with detected invasive epithelial ovarian cancer, the estimated 5-year survival is 93% and 75% for localized (stage I) or regional (stage II or stage IIIA1 with regional lymph node involvement) disease, respectively, compared with 31% for distant disease (remaining stage III or stage IV; refs. 3, 4). Unfortunately, ovarian cancer is usually detected in advanced stages (stages III and IV) due to nonspecific clinical symptoms at earlier stages and the lack of an effective screening approach (3). Consequently, there is a clear unmet clinical need for the development of highly specific and sensitive assays to detect ovarian cancer in its earliest stages.

Ovarian cancer screening trials such as the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (5), the U.K. Collaborative Trial of Ovarian Cancer Screening (UKCTOCS; ref. 6), and the Normal Risk Ovarian Screening Study (ref. 7) have shown that existing biomarkers, including cancer antigen 125 (CA-125), may provide a shift toward detection of earlier stages of cancer but not a survival benefit, likely because of suboptimal detection of ovarian cancers. These analyses open the door to new and more effective approaches aimed at identifying combinations of biomarkers with improved performance for early ovarian cancer detection. Such approaches would need to be affordable, accessible, and have high performance for high-grade serous ovarian carcinoma (HGSOC), which is more aggressive, typically detected in late stages, and responsible for the majority of ovarian cancer deaths (8).

A secondary clinical need also exists in determining whether women presenting with ovarian masses have benign or malignant lesions. In this setting, preoperative malignancy classification is challenging and may lead to unnecessary procedures. A number of biomarkers have been proposed for this purpose, including CA-125 and human epididymis protein 4 (HE4; refs. 9–11). Prediction models using a combination of multiple protein biomarkers as well as age and menopausal status (12), the risk of malignancy index (ref. 13), and other ultrasound classifications (International Ovarian Tumor Analysis; ref. 14) have been developed, but these vary in accuracy, performance, and ease of use in a clinical setting.

Analyses of circulating cell-free DNA (cfDNA) provide another approach for early cancer detection in the screening or diagnostic settings. Approaches for ovarian cancer have included identification of tumor-specific mutations (15, 16), alterations in DNA methylation (17), or specific repeat sequences (18, 19); however, these approaches have had limited sensitivities for early-stage disease, may be confounded by alterations in white blood cells (20), and have not been validated for clinical use. An emerging approach to cfDNA analysis focuses on the “cfDNA fragmentome,” defined as the genome-wide compendium of cfDNA fragments in the circulation, which provides an integrated view of the chromatin, genome, epigenome, and transcriptome states of normal and cancer cells of an individual. Recent cfDNA fragmentome analyses using low-coverage whole-genome sequencing (WGS) combined with machine learning using DNA evaluation of fragments for early interception (DELFI) have demonstrated high sensitivity for early detection across lung (21), liver (22), and other cancer types (23–26) using an accessible, cost-efficient approach (27) that is not confounded by clonal hematopoiesis (20, 28).

In this study, we present a method to detect ovarian cancer using cfDNA fragmentomes combined with protein biomarkers. This multianalyte combination has the benefit of utilizing genome-wide multifeature fragmentation analyses together with complementary protein biomarkers CA-125 and HE4 from the same blood draw that may have utility in both the screening and diagnostic settings.

Clinical Cohorts

Blood samples in the discovery cohort were collected from women with ovarian cancer (n = 94), with benign adnexal masses (n = 203), or without any known ovarian lesions (n = 182), who were part of previously reported prospective diagnostic or screening efforts at hospitals in the Netherlands and Denmark (Table 1; Supplementary Table S1; refs. 9, 21, 23, 29). For the validation cohort, we analyzed samples from patients prospectively collected at the University of Pennsylvania or through a commercial source in the United States (n = 40 patients with ovarian cancer, n = 50 patients with benign ovarian masses, and n = 22 without known ovarian lesions; Table 1; Supplementary Table S1). The patients analyzed were largely representative of ovarian cancer subtypes, including high-grade serous (HGSOC), low-grade serous (LGSOC), clear cell, mucinous, and endometrioid ovarian cancers, across all International Federation of Gynecology and Obstetrics (FIGO) stages (Table 1).

Table 1.

Patient characteristics for discovery and validation cohorts.

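| Patient characteristic | Discovery: Noncancer | Discovery: Cancer | Discovery: Benign mass | Validation: Noncancer | Validation: Cancer | Validation: Benign mass |
|---|---|---|---|---|---|---|
| Participants, n | 182 | 94 | 203 | 22 | 40 | 50 |
| Age, years | | | | | | |
| Mean | 56 | 62 | 58 | 59 | 53 | 52 |
| Range | 49–75 | 37–85 | 19–92 | 34–84 | 13–85 | 16–83 |
| Cancer stage, n (%) | | | | | | |
| I | — | 32 (34%) | — | — | 14 (35%) | — |
| II | — | 26 (28%) | — | — | 5 (13%) | — |
| III | — | 30 (32%) | — | — | 11 (28%) | — |
| IV | — | 2 (2%) | — | — | 6 (15%) | — |
| Unknown | — | 4 (4%) | — | — | 4 (10%) | — |
| Cancer subtype, n (%) | | | | | | |
| High-grade serous | — | 39 (41%) | — | — | 16 (40%) | — |
| Low-grade serous | — | 7 (7%) | — | — | 5 (13%) | — |
| Clear cell | — | 11 (12%) | — | — | 2 (5%) | — |
| Mucinous | — | 12 (13%) | — | — | 5 (13%) | — |
| Endometrioid | — | 14 (15%) | — | — | 5 (13%) | — |
| Other | — | 11 (12%) | — | — | 7 (18%) | — |
| BRCA mutation status, n (%) | | | | | | |
| Positive | — | 1 (1%) | — | — | 3 (7.5%) | 1 (2%) |
| Negative | — | 12 (13%) | — | — | 8 (20%) | 10 (20%) |
| Not tested | — | — | — | — | 8 (20%) | 39 (78%) |
| Unknown | 182 (100%) | 81 (86%) | 203 (100%) | 22 (100%) | 21 (52.5%) | — |
| Benign lesion, n (%) | | | | | | |
| None | 182 (100%) | — | — | 22 (100%) | — | — |
| Cystadenoma/adenofibroma–serous | — | — | 27 (13%) | — | — | 17 (34%) |
| Cystadenoma/adenofibroma–mucinous | — | — | 31 (15%) | — | — | 6 (12%) |
| Cystadenoma/adenofibroma–NA | — | — | 7 (3%) | — | — | 2 (4%) |
| Endometriosis | — | — | 11 (5%) | — | — | 2 (4%) |
| Mature teratoma of the ovary | — | — | 8 (4%) | — | — | 4 (8%) |
| Ovarian fibroma | — | — | 7 (3%) | — | — | 4 (8%) |
| Thecoma | — | — | 4 (2%) | — | — | 3 (6%) |
| Other | — | — | 20 (10%) | — | — | 12 (24%) |
| Unknown | — | — | 88 (43%) | — | — | — |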

For all participants, we isolated plasma, extracted cfDNA, created genomic libraries, and performed next-generation WGS of cfDNA fragments at ∼2× coverage. An average of 3 mL of plasma per sample was used, and all samples were successfully processed, without any sample or technical failures (Supplementary Tables S2 and S3). For all patients, we quantified levels of CA-125 and HE4 using clinical-grade immunoassay measurements from the same blood samples that were used for genomic analyses or from serum samples of the same patients (Supplementary Table S4).

cfDNA Fragmentomes Reveal Tumor-Specific Changes in Ovarian Cancer

We evaluated cfDNA fragmentation profiles that captured fragment size and coverage distributions in 473 nonoverlapping genome-wide 5-Mb regions, covering 2.4 Gb of the genome (Fig. 1; refs. 21, 22). Fragmentation profiles were homogeneous among individuals without cancer and showed only limited changes in individuals with benign adnexal masses (Fig. 2A). In contrast, fragmentation profiles from patients with cancer showed marked heterogeneity both between patients and across different regions of the genome for the same individual, consistent with changes in chromatin landscapes that affect cfDNA fragmentation (Fig. 2).
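As a rough illustration of what a binned fragmentation profile contains, the sketch below computes a short-to-long fragment ratio per 5-Mb bin. It is a minimal sketch, not the study pipeline: the function name, the input record layout, the assignment of a fragment to a bin by its start coordinate, and the short (100–150 bp) and long (151–220 bp) length ranges are all assumptions made for illustration.

```python
from collections import defaultdict

BIN_SIZE = 5_000_000  # 5-Mb bins, as described in the text
SHORT = (100, 150)    # assumed short-fragment length range (bp)
LONG = (151, 220)     # assumed long-fragment length range (bp)


def fragmentation_profile(fragments):
    """Return {(chrom, bin_index): short/long ratio}.

    `fragments` is an iterable of (chrom, start, length) records
    (a hypothetical layout chosen for this example).
    """
    counts = defaultdict(lambda: [0, 0])  # [short, long] counts per bin
    for chrom, start, length in fragments:
        key = (chrom, start // BIN_SIZE)  # bin assigned by fragment start
        if SHORT[0] <= length <= SHORT[1]:
            counts[key][0] += 1
        elif LONG[0] <= length <= LONG[1]:
            counts[key][1] += 1
    # Bins without long fragments are skipped to avoid division by zero.
    return {k: s / l for k, (s, l) in counts.items() if l > 0}


# Toy fragments: three fall in the first bin, two in the second.
toy = [
    ("chr1", 1_000, 120), ("chr1", 2_000, 170), ("chr1", 3_000, 180),
    ("chr1", 5_000_100, 140), ("chr1", 5_000_200, 200),
]
profile = fragmentation_profile(toy)
```

In a real analysis the per-bin values would typically also be corrected for coverage and GC content before being compared across individuals; those steps are omitted here.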

Figure 1.

Schematic of ovarian cancer detection in screening and diagnostic models combining DELFI and protein biomarkers. Individuals undergo blood collection, plasma is extracted, and constructed genomic libraries undergo WGS at low coverage (∼2×). Using blood samples from the same collection, proteins are quantified enabling the combined assessment of genome-wide fragmentation profiles and protein biomarkers. These features are evaluated in a machine learning model that classifies cancer and noncancer individuals.

Figure 2.

Characteristics of cfDNA fragmentation for ovarian cancer detection. A, Fragmentation profiles in which each line represents one participant and is colored according to that participant’s correlation to the median genome-wide profile for women without cancer. B, Heatmap of fragmentation and protein features shows marked heterogeneity in the cfDNA fragmentome among individuals with ovarian cancer compared with those with benign lesions or without disease. In the heatmap, individuals are split into disease groups and then successively ordered by DELFI-Pro, CA-125, and HE4, and cancers are categorized according to stage and subtype. Fragmentation features are clustered in columns. The top bar indicates the feature family: the short-to-long ratio of fragment sizes (ratio) or chromosomal arm representation (z-scores). C, Chromosomal gains (red) and losses (blue) characteristic of ovarian cancer tumor tissue evaluated in TCGA were observed in cfDNA fragmentation data in patients with ovarian cancer and were absent from those with benign lesions or without disease (purple indicates no change in chromosomal representation). D, Feature importance, as measured by scaled coefficients from the PLR locked screening model for ovarian cancer, demonstrates contributions of cfDNA fragmentation (fragment length and aneuploidy) and proteins (CA-125 and HE4) to high performance.


Ovarian tumors are known for having marked large-scale genomic changes (29–32). As cfDNA fragmentomes may reflect large-scale genomic alterations contained in DNA fragments released from tumor cells, we also examined chromosomal copy-number changes in the circulation of these individuals. In addition to changes in genome-wide cfDNA fragmentation (Fig. 2A and B), we observed chromosomal gains and losses consistent with those expected from prior analyses of ovarian tumors in The Cancer Genome Atlas (TCGA; n = 597; ref. 30) as well as from genomic analyses of early ovarian cancer precursors (32), including gains of 3q, 8q, 12p, 20p, and 20q and losses of 4q, 5q, 6q, 8p, 13q, 17p, and 22q (Fig. 2C). These gains and losses were not observed in individuals without cancer or with benign adnexal masses, consistent with the notion that although ovarian tumors and benign lesions may share similar anatomic locations, the observed changes in cfDNA were cancer-specific.

Detection of Ovarian Cancer Using cfDNA Fragmentome and Protein Analyses

Given the concordance between genomic changes and cfDNA fragmentation in ovarian cancer, we applied a machine learning approach to ascertain if alterations in cfDNA fragmentomes could distinguish individuals in the discovery cohort with ovarian cancer from those without ovarian lesions. The model incorporated genome-wide fragmentation profiles, chromosomal arm–level changes, and the concentrations of protein biomarkers CA-125 and HE4. We previously utilized similar approaches to construct high-performance classifiers for lung and liver cancer detection that were externally validated (21, 22). These approaches utilized penalized logistic regression (PLR) due to its parsimonious model architecture, interpretability, and robustness to overfitting. In this study, we determined the performance of this classifier using repeated 5-fold cross-validation, producing a score for each patient as an average of 10 cross-validation repeats [DELFI protein (DELFI-Pro) score; Supplementary Table S5]. The DELFI-Pro classifier utilized both fragmentomic features and proteomic measurements for detecting individuals with ovarian cancer (Fig. 2D). The DELFI-Pro classifier employed a PLR model to retain only the most informative features, including fragmentation characteristics reflecting chromatin and chromosomal changes alongside conventional protein biomarkers.
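The scoring scheme described above (repeated 5-fold cross-validation, with each participant's score taken as the average of the held-out predictions over 10 repeats) can be sketched as follows. This is a minimal sketch under stated assumptions, not the DELFI-Pro implementation: it substitutes a small L2-penalized logistic regression fitted by gradient descent for the study's PLR, uses unstratified folds, and runs on made-up two-feature data. All function names and hyperparameters are hypothetical.

```python
import math
import random


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_plr(X, y, penalty=0.1, lr=0.5, epochs=300):
    """L2-penalized logistic regression via batch gradient descent.

    Assumption: L2 is used here for brevity; the study's penalty may differ.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = _sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * (gw[j] / n + penalty * wj) for j, wj in enumerate(w)]
        b -= lr * gb / n  # the intercept is left unpenalized
    return w, b


def predict(model, x):
    w, b = model
    return _sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def repeated_cv_scores(X, y, folds=5, repeats=10, seed=0):
    """Average each sample's held-out score over all repeats."""
    rng = random.Random(seed)
    n = len(X)
    totals = [0.0] * n
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        for f in range(folds):
            test_idx = order[f::folds]  # every sample is held out once per repeat
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            model = fit_plr([X[i] for i in train_idx], [y[i] for i in train_idx])
            for i in test_idx:
                totals[i] += predict(model, X[i])
    return [t / repeats for t in totals]


# Made-up data: 10 "noncancer" (label 0) and 10 "cancer" (label 1) samples.
X = [[k / 10, 0.0] for k in range(10)] + [[2 + k / 10, 1.0] for k in range(10)]
y = [0] * 10 + [1] * 10
scores = repeated_cv_scores(X, y)
```

Because every sample is scored only by models that never saw it during fitting, the averaged scores give a less optimistic estimate of performance than scores from a model fitted to all of the data.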

Because clinical characteristics can influence biomarker profiles evident in the circulation, we examined the relationship between the DELFI-Pro score and demographic parameters such as age or common comorbidities such as diabetes, hypertension, or atherosclerosis in individuals without ovarian disease for whom this information was available. We observed either no or limited association between DELFI-Pro scores and these conditions, although this conclusion was limited by incomplete availability of clinical information (Supplementary Fig. S1A–S1C).

We then evaluated the relationship between DELFI-Pro scores and the presence and stage of ovarian cancer. The cross-validated DELFI-Pro scores, spanning a possible range from 0 to 1, were low for the 182 women who were free of ovarian disease, with a median score of 0.07. In contrast, women with ovarian cancers had significantly higher median scores across all stages, including stage I = 0.93, stage II = 0.93, stage III = 1.00, and stage IV = 1.00 (P < 0.0001 across all tumor stages, Wilcoxon rank-sum test, Fig. 3A). Scores did not differ by age (P = 0.95, Pearson correlation test) and were not different among women with cancer who were symptomatic or asymptomatic (P = 0.61, Wilcoxon signed-rank test) or who were pre- or post-menopausal (P = 0.36, Wilcoxon signed-rank test; Supplementary Fig. S2A–S2C).

Figure 3.

DELFI-Pro detects ovarian cancer with high sensitivity and specificity. A, In the discovery cohort, patients with ovarian cancer across all stages have elevated DELFI-Pro scores in HGSOC as well as other ovarian subtypes. B, ROC analyses of the discovery cohort show high performance across stages and in HGSOC. C and D, The locked DELFI-Pro model at locked thresholds (e.g., for 99% specificity, DELFI-Pro score >0.66) showed similar performance in the validation cohort.


DELFI-Pro detected patients with ovarian cancer with an AUC of 0.96 [95% confidence interval (CI), 0.93–0.99; Fig. 3B]. Among early-stage ovarian cancers, performance remained robust, with AUCs of 0.96 (95% CI, 0.92–0.99) and 0.94 (95% CI, 0.87–1.00) for stages I (n = 32) and II (n = 26), respectively (Fig. 3B). Individuals with advanced-stage [stages III (n = 30) and IV (n = 2)] ovarian cancer were detected with high sensitivity among the individuals analyzed [AUCs 0.99 (95% CI, 0.98–1.00) and 1.00 (95% CI, 1.00–1.00), respectively]. Stability analyses of the cross-validated model revealed highly consistent DELFI-Pro scores for noncancers and cancers regardless of the held-out fold or source of sample collection (Supplementary Fig. S3A and S3B). High performance was observed among patients with HGSOC (n = 39), with an AUC = 0.99 (95% CI, 0.99–1.00), as well as in other ovarian cancers, including LGSOC (n = 7), endometrioid (n = 14), mucinous (n = 12), clear cell (n = 11), or other (n = 11) subtypes [AUCs of 0.99 (95% CI, 0.98–1.00), 0.97 (95% CI, 0.94–1.00), 0.94 (95% CI, 0.88–1.00), 0.84 (95% CI, 0.65–1.00), and 0.96 (95% CI, 0.87–1.00), respectively; Supplementary Fig. S4]. High performance was also observed when assessing only individuals who were asymptomatic [AUC 0.99 (95% CI, 0.97–1); Supplementary Fig. S5A and S5B]. Other genome-wide analyses, such as ichorCNA, which only includes copy-number changes, and analyses of overall median cfDNA fragment lengths provided substantially weaker performance, with overall AUCs of 0.71 (95% CI, 0.64–0.78) and 0.59 (95% CI, 0.52–0.66), respectively (Supplementary Fig. S6A–S6D).
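For readers less familiar with the AUC values reported above, the area under the ROC curve equals the probability that a randomly chosen cancer case scores higher than a randomly chosen noncancer case, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the toy scores are illustrative assumptions, and the confidence intervals reported in the text are not reproduced here.

```python
def roc_auc(cancer_scores, noncancer_scores):
    """AUC as the fraction of (cancer, noncancer) pairs ranked correctly."""
    wins = 0.0
    for c in cancer_scores:
        for n in noncancer_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(cancer_scores) * len(noncancer_scores))


# Toy scores: perfect separation, then one misordered pair out of four.
auc_perfect = roc_auc([0.9, 0.8], [0.1, 0.2])
auc_partial = roc_auc([0.9, 0.3], [0.1, 0.5])
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of this size; rank-based formulas give the same value more efficiently.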

Given the low incidence of ovarian cancer (10.3 of 100,000 age-adjusted women in the U.S. population; ref. 33), any screening test would need to have high specificity in order to give a high positive predictive value (PPV) and minimize the absolute number of false positive results leading to potentially unnecessary procedures or prolonged diagnostic odysseys. At a specificity >99%, the cross-validated sensitivity in this setting was 72%, 69%, 87%, and 100% for stages I to IV, respectively (Table 2). HGSOCs typically had high DELFI-Pro scores, with 90% detected at this threshold (83%, 88%, 91%, and 100% for stages I–IV, respectively). At the same specificity, CA-125 alone detected a significantly lower fraction of ovarian cancers in this population [34%, 62%, 63%, and 100% for stages I–IV, respectively; P = 0.001, two-sided test of equal proportions; Supplementary Fig. S7].
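The idea of reporting sensitivity at a fixed specificity can be sketched as follows: the threshold is chosen from the noncancer scores alone, and the fraction of cancer scores above it is then reported. This is a minimal sketch with made-up scores; the function names and the rule for picking the threshold (the smallest observed noncancer score that keeps specificity strictly above the target) are assumptions, not the study's procedure.

```python
def threshold_for_specificity(noncancer_scores, target=0.99):
    """Smallest observed score t such that calling `score > t` positive
    keeps specificity strictly above `target` (requires target < 1)."""
    n = len(noncancer_scores)
    for t in sorted(set(noncancer_scores)):
        specificity = sum(s <= t for s in noncancer_scores) / n
        if specificity > target:
            return t
    raise ValueError("target must be below 1.0")


def sensitivity(cancer_scores, threshold):
    """Fraction of cancer cases scoring above the threshold."""
    return sum(s > threshold for s in cancer_scores) / len(cancer_scores)


# Made-up scores: 200 noncancer values from 0.000 to 0.995.
noncancer = [k / 200 for k in range(200)]
cancer = [0.5, 0.99, 0.995, 1.0]

thr = threshold_for_specificity(noncancer)  # allows 1 false positive in 200
sens = sensitivity(cancer, thr)
```

Locking the threshold on one cohort and applying it unchanged to another, as done for the validation cohort, only requires calling `sensitivity` on the new scores with the stored `thr`.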

Table 2.

Performance of the DELFI-Pro screening model and protein biomarkers for detection of ovarian cancer.ᵃ

| Individuals analyzed | Discovery: N | Discovery, sensitivity at >99% specificity: DELFI-Pro | Discovery: CA-125 | Discovery: HE4 | Validation: N | Validation: specificity at the locked thresholdᵃ | Validation, sensitivity at the locked threshold: DELFI-Pro | Validation: CA-125 | Validation: HE4 |
|---|---|---|---|---|---|---|---|---|---|
| Noncancer | 182 | — | — | — | 22 | 100% (89%–100%) | — | — | — |
| Ovarian cancer | 94 | 77% (69%–83%) | 53% (45%–61%) | 42% (34%–50%) | 40 | — | 73% (60%–82%) | 60% (47%–72%) | 40% (28%–53%) |
| HGSOC | 39 | 90% (79%–95%) | 72% (59%–82%) | 64% (51%–75%) | 16 | — | 81% (61%–92%) | 69% (48%–84%) | 63% (42%–79%) |
| Stage I | 32 | 72% (58%–83%) | 34% (22%–49%) | 28% (17%–43%) | 14 | — | 71% (49%–87%) | 57% (36%–76%) | 36% (19%–57%) |
| Stage II | 26 | 69% (53%–82%) | 62% (46%–75%) | 27% (15%–43%) | 5 | — | 80% (44%–98%) | 60% (27%–86%) | 20% (2%–57%) |
| Stage III | 30 | 87% (73%–94%) | 63% (48%–76%) | 67% (52%–79%) | 11 | — | 73% (48%–89%) | 64% (39%–83%) | 36% (18%–61%) |
| Stage IV | 2 | 100% (43%–100%) | 100% (43%–100%) | 100% (43%–100%) | 6 | — | 83% (50%–98%) | 67% (35%–88%) | 67% (35%–88%) |
ᵃ The analyses in the discovery and validation cohorts were performed at the locked thresholds for DELFI-Pro of 0.66, CA-125 of 128.5 U/mL, and HE4 of 212.8 pmol/L, all corresponding to a specificity >99% in the discovery cohort. For each value, the 90% CIs are indicated in the parentheses.
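The text does not state how the intervals in Table 2 were computed. As an illustration only, a Wilson score interval at the 90% level gives values numerically consistent with two table entries whose counts are unambiguous: 2 of 2 stage IV cases (43%–100%) and 22 of 22 noncancer participants (89%–100%). The sketch below is an assumption about a plausible method, not a statement of the authors' procedure, and the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.645):
    """Wilson score interval for a proportion.

    z = 1.645 corresponds to a two-sided 90% interval.
    """
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


low_2, high_2 = wilson_interval(2, 2)      # stage IV, discovery cohort
low_22, high_22 = wilson_interval(22, 22)  # noncancer, validation cohort
```

Unlike the simple normal approximation, this interval stays within 0% to 100% and does not collapse to zero width when every case is detected, which matters for small groups such as stage IV.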

In addition to the cross-validated analysis of the discovery cohort of European (EU) patients, we evaluated the locked DELFI-Pro classifier in a validation cohort of 62 patients from the United States. The validation cohort included patients across different ovarian cancer subtypes as well as individuals without ovarian cancer (Supplementary Table S1). Similar to the observations from the discovery cohort, the fragmentation profiles of women without ovarian cancer in the validation cohort were highly consistent across the genome, whereas those of patients with ovarian cancer were heterogeneous (Supplementary Fig. S8). The chromosomal changes observed in the cfDNA of the U.S. validation cohort patients resembled those observed in the EU discovery group and in ovarian tumor tissue from TCGA (Supplementary Fig. S9A–S9C). The DELFI-Pro model detected patients with cancer in the validation cohort with high performance (AUC = 0.93, 95% CI, 0.87–1.00), including patients with HGSOC (AUC = 1.00, 95% CI, 1.00–1.00). At a fixed score threshold selected to achieve >99% specificity in the discovery cohort, we detected 73% of ovarian cancers overall and 81% of HGSOC (Fig. 3C and D; Supplementary Fig. S10A–S10C; Supplementary Table S6). These results revealed the shared biological features of cfDNA fragmentation across cohorts and demonstrated the robustness and generalizability of DELFI-Pro in the detection of ovarian cancer in different populations.

Distinguishing Ovarian Cancer from Benign Masses

We examined whether our approach could be useful in a diagnostic setting for distinguishing between patients with ovarian cancer and those with benign adnexal masses, which can be difficult to differentiate clinically using ultrasound-based prediction models. We observed that genome-wide fragmentation profiles differed between patients with cancer and those with benign lesions (Fig. 2A). We trained and cross-validated a DELFI-Pro machine learning model in the EU cohort to distinguish ovarian cancers from benign lesions. This machine learning model was similar to that developed for the screening setting, with the rank-ordered DELFI-Pro scores being highly correlated across all discovery cohort ovarian cancer samples (n = 94; R = 0.78, P < 2.2e−16; Supplementary Fig. S11). Using this model, patients with benign lesions had a median score of 0.17, whereas patients with cancer had a stage-dependent increase in DELFI-Pro scores. Individuals with benign lesions had similarly low scores regardless of lesion size or whether the patient was symptomatic or asymptomatic (Supplementary Fig. S12A and S12B). The model had strong performance in identifying patients with cancer as compared with those with benign lesions, with a ROC AUC of 0.88 (95% CI, 0.83–0.92), ranging from 0.82 (95% CI, 0.74–0.90) to 1.00 (95% CI, 1.00–1.00) for stages I to IV. Patients with HGSOC, LGSOC, or endometrioid cancers were more easily distinguished from benign lesions [AUCs of 0.96 (95% CI, 0.93–1.00), 0.84 (95% CI, 0.67–1.00), and 0.91 (95% CI, 0.85–0.98), respectively] than those with mucinous or clear cell subtypes [AUCs of 0.65 (95% CI, 0.51–0.79) and 0.77 (95% CI, 0.62–0.92), respectively; Supplementary Figs. S13A–S13D and S14].

In the setting of a patient with a mass suspicious for ovarian cancer, the clinical pathway generally involves referral to a gynecologic oncologist for surgical staging. A noninvasive test could help inform referral decisions, plan the extent of surgical resection, or even avoid resection in young or frail patients. In these scenarios, high sensitivity is critical to tailoring response, and a moderate specificity may be acceptable, because of the importance of not missing patients with ovarian cancer while at the same time avoiding anxiety and unnecessary surgeries for patients who would not need further follow-up (34). Consistent with this approach, at 80% specificity in the discovery cohort, we distinguished 95% of patients with HGSOC from patients with benign masses. Evaluation of the locked model in the validation cohort resulted in an AUC of 0.81 (95% CI, 0.72–0.91; Supplementary Figs. S13C, S13D, and S15A–S15C), and at the score threshold achieving 80% specificity in the discovery cohort, we maintained a relatively high sensitivity, identifying 81% of patients with HGSOC at a specificity of 82% (Supplementary Table S6). In this setting, the DELFI-Pro scores appeared related to overall tumor burden, as we observed a positive correlation between the DELFI-Pro scores and the sum of reported lesion diameters where these data were available (R = 0.65; P = 0.03; Supplementary Fig. S16A and S16B).

Simulating the Performance of DELFI-Pro at the Population Scale

To examine how DELFI-Pro would perform on a population scale for ovarian cancer screening, we used Monte Carlo simulations to evaluate a theoretical screening population of 100,000 women (Fig. 4A). We compared DELFI-Pro with two other proposed clinical tests: CA-125 at a cut-off of 30 U/mL and HE4 at a cut-off of 70 pmol/L. For both CA-125 and HE4, we evaluated the cut-point using performance estimates reported in the literature (35, 36) as well as those observed in our cohort, whereas for DELFI-Pro, we evaluated performance at the cut-point achieving >99% specificity in our analyses. We blended sensitivity estimates according to the stage distribution in the UKCTOCS trial (36) and modeled the degree of uncertainty of sensitivities and specificities of these tests in our theoretical population based on a 0.0037 prevalence of ovarian cancer (Fig. 4B; refs. 33, 37). Monte Carlo simulations from these predicted probability distributions demonstrated that the PPV for DELFI-Pro was high (median 23.6%, 95% CI, 8.73%–68.5%), whereas all other modalities had a median PPV estimate of 9.17% or lower (Fig. 4C). Given the low prevalence of ovarian cancer and the risks of exploratory surgery, a PPV greater than 10% (38, 39) is needed to justify population-wide screening for ovarian cancer in a way that balances the benefits of early detection against potential harm from unnecessary surgical procedures in healthy women misdiagnosed with cancer. In addition, the cut-off chosen for DELFI-Pro at >99% specificity (no false positives in either the discovery or validation cohort) led to a predicted low false positive rate (FPR; median 0.95%, 95% CI, 0.14%–3.1%) as compared with the other four scenarios simulated (range of FPR medians, 3.12%–20.60%; Fig. 4D). These analyses suggest that an accessible, high-adherence, sensitive, and specific assay like DELFI-Pro could enable population-wide ovarian cancer screening.
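The dependence of PPV on prevalence can be made concrete with Bayes' rule, PPV = (sensitivity × prevalence) / [sensitivity × prevalence + (1 − specificity) × (1 − prevalence)]. The sketch below propagates uncertainty in sensitivity and specificity through that formula by Monte Carlo sampling. It is a simplified illustration, not the study's simulation: the Beta distributions with a uniform prior, the counts passed in, and the function names are assumptions, and the stage-weighted blending of sensitivities described above is omitted.

```python
import random


def ppv(sens, spec, prev):
    """Positive predictive value from sensitivity, specificity, prevalence."""
    true_pos = sens * prev
    false_pos = (1.0 - spec) * (1.0 - prev)
    return true_pos / (true_pos + false_pos)


def simulate_ppv(tp, fn, tn, fp, prev=0.0037, draws=10_000, seed=1):
    """Return (median, 2.5th percentile, 97.5th percentile) of simulated PPV.

    Sensitivity and specificity are drawn from Beta distributions built
    from the given counts with a uniform prior (an assumption).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        sens = rng.betavariate(tp + 1, fn + 1)
        spec = rng.betavariate(tn + 1, fp + 1)
        samples.append(ppv(sens, spec, prev))
    samples.sort()
    return samples[draws // 2], samples[int(0.025 * draws)], samples[int(0.975 * draws)]


# Hypothetical counts chosen only to exercise the code, not study data.
median, lower, upper = simulate_ppv(tp=75, fn=25, tn=199, fp=1)
point_estimate = ppv(0.75, 0.99, 0.0037)
```

At a prevalence of 0.0037, a test with 75% sensitivity and 99% specificity has a PPV of roughly 22%, which shows why even small losses in specificity sharply reduce PPV in a screening population.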

Figure 4.

Modeling the implementation of DELFI for ovarian cancer screening. A, The proposed approach integrates the use of cfDNA fragmentation and protein analyses from a blood draw. Women with a positive result would undergo a transvaginal ultrasound and, if positive, would subsequently have a diagnostic cancer workup. A negative result at any step in this continuum would remove patients from subsequent steps and lead to annual screening. B, Modeling a theoretical population of 100,000 women based on existing performances for CA-125 and HE4, as well as those observed for DELFI-Pro. Predictive distributions for the (C) PPV and (D) FPR highlight the potential benefit of implementing DELFI-Pro as compared with existing biomarkers. JHU, Johns Hopkins University; TVUS, transvaginal ultrasound.


There is a clinical unmet need for an approach that improves detection of early-stage ovarian cancer and provides guidance toward differentiating between benign or malignant ovarian masses. In this study, we demonstrate that cfDNA fragmentomes combined with existing protein biomarkers can noninvasively detect early-stage ovarian cancer and distinguish these lesions from benign ovarian masses.

The performance of our multianalyte and multifeature approach for detection of ovarian cancer was high and suggests that the combination of cfDNA and protein measurements is complementary, providing more information than either alone. This is particularly important for early-stage disease, especially for high-grade serous cancer, in which intervention is thought to be most useful (32, 40). The validation of this approach in a fully independent cohort suggests that the method is robust and likely generalizable across different populations.

The survival benefit of current methods used in screening for ovarian cancer remains unclear (6, 41). However, the development of a new classifier like DELFI-Pro that provides high performance at higher specificity than obtained in previous studies opens a new avenue for detection of individuals who may benefit most from subsequent diagnostic workup or intervention. Our population-scale simulations suggest that the improved performance of DELFI-Pro in comparison with either CA-125 or HE4 alone would increase the PPV and decrease the predicted FPR, thereby improving the overall impact and benefit-to-risk ratio of this approach in a screening setting, especially when the disease prevalence is low. Recent literature suggests that early diagnoses of cancers reduce treatment costs (42), thereby potentially decreasing overall societal health care costs while improving outcomes.

Although the study was performed in a sizeable EU diagnostic cohort, it was subsequently validated in a modestly sized but fully external U.S. cohort. Additional and larger prospective studies, including one already underway (NCT04971421), will be needed to validate this approach for clinical use. In other cancer types, we have previously associated DELFI fragmentation scores with survival outcome data (21), and future efforts are needed to evaluate the prognostic potential of the DELFI-Pro score in ovarian cancer. The performance for detecting some subtypes of ovarian cancer (i.e., clear cell or mucinous) was lower, and inclusion of other protein biomarkers in our assay that are tailored to these cell types of origin may increase performance in the future. Assays using a larger number of proteins have shown promising initial results (43) but are not yet broadly available for research or clinical use. The use of cfDNA fragmentation to distinguish between different cancer subtypes (21) may be feasible for differentiating among ovarian cancer subtypes and enabling personalized therapeutic approaches.

Ultimately, evaluation of survival outcomes will be important to demonstrate the benefit of population-scale screening with this approach, as stage shift alone may not result in an effective alternative measure of survival for ovarian cancer (6, 44). The use of both cfDNA and protein measurements may initially seem to be complex, but both types of analytes can be assessed from the same sample of blood, and optimized methods suggest that this combined approach would be cost-efficient and accessible. Overall, this study provides a new accessible approach for early detection of ovarian cancer that may overcome current challenges for ovarian cancer screening and reduce the morbidity and mortality of this disease.

Study Population and Design

Liquid biopsies from 591 healthy individuals or individuals with ovarian cancer or benign adnexal masses were prospectively collected at the University of Pennsylvania (Penn BioTrust Collection; RRID: SCR_022387) from previously reported diagnostic or screening studies at the Netherlands Cancer Institute (trial NL58253.031.16; refs. 9, 29), the Danish Endoscopy III trial (21, 23, 45), or the Netherlands COCOS trial (Netherlands trial register ID NTR1829; refs. 21, 23) or through a commercial provider of biobank research specimens (BioIVT). All samples were obtained under Institutional Review Board–approved protocols with written informed consent from all participants for research use at participating institutions, and the studies were performed according to the Declaration of Helsinki. Liquid biopsies from healthy individuals were obtained at the time of routine clinical appointments. Individuals were considered healthy if they had no prior history of cancer. Individuals with symptoms indicating clinical follow-up or at high risk for development of ovarian cancer were assessed using imaging of the pelvic region to identify ovarian masses. Depending on size and estimated risk of malignancy, patients received either an exploratory laparotomy with frozen section (and staging when confirmed malignant or debulking in the case of an unexpected higher stage) or, if the lesion was expected to be benign, laparoscopic cystectomy/adnectomy. Liquid biopsies from patients with ovarian cancer or benign adnexal masses were obtained at the time of diagnosis prior to surgical resection or therapeutic intervention. Of the total 591 women included in the study, 204 were healthy, 253 had a benign adnexal mass, and 134 had ovarian cancer. All stages of ovarian cancer were represented in the study population, including 46, 31, 41, and 8 individuals with stages I, II, III, and IV of cancer, respectively (n = 8, stage unknown). The cancer cohort consisted largely of HGSOC (n = 55) with a subset of individuals having LGSOC or serous (n = 12), clear cell (n = 13), mucinous (n = 17), endometrioid (n = 19), or another (n = 18) histopathologic ovarian cancer diagnosis. Clinical data were completely de-identified for all individuals included in this study and are listed in Supplementary Table S1.

This study was designed to provide proof-of-concept for noninvasive detection of ovarian cancer using a genome-wide fragmentome-based approach. For cfDNA analyses, all liquid biopsies were processed to separate blood plasma from which cfDNA was extracted and processed to create genomic libraries for WGS at ∼2× coverage. The study population was subset to assess a discovery cohort to train and cross-validate a machine learning model for ovarian cancer detection, followed by application of the trained model to the subset of the population remaining as the validation cohort. Prediction of ovarian cancer was assessed in two clinical scenarios: (i) a screening model (ovarian cancer vs. no ovarian lesion) and (ii) a diagnostic model (ovarian cancer vs. benign mass). The discovery cohort was defined to include (i) healthy individuals with no history of prior cancer and patients with ovarian cancer or (ii) individuals with a benign adnexal mass and patients with ovarian cancer for the screening and diagnostic models, respectively (n = 479 discovery, n = 112 validation).

Liquid Biopsy Collection and Extraction of cfDNA

We collected venous peripheral blood in K2-EDTA (ethylenediaminetetraacetic acid) or Streck tubes and, within 2 hours, centrifuged the tubes at 800 × g at 4°C for 10 minutes. Then the plasma fraction was transferred to new tubes and spun at 18,000 × g for 10 minutes at room temperature to pellet remaining cellular debris. EDTA tubes from the Danish Endoscopy III trial were centrifuged at low speed (3,000 × g) for 10 minutes within 2 hours from blood collection. The plasma portion from the first spin was spun a second time for 10 minutes. Plasma was subsequently aliquoted and stored at −80°C. cfDNA was isolated from ∼4 to 5 mL of plasma using QIAamp Circulating Nucleic Acid Kit (Qiagen GmbH). Extracted cfDNA was eluted in 52 μL into LoBind tubes (Eppendorf AG) and quantified using the Bioanalyzer 2100 (Agilent Technologies).

Genomic Library Construction

cfDNA libraries for next-generation WGS were prepared with 15 ng of cfDNA when available or the entire purified amount when less than 15 ng (Supplementary Table S2; refs. 21–23). The genomic libraries were prepared using NEBNext DNA Library Prep Kit for Illumina (New England Biolabs) with four main modifications to the manufacturer’s guidelines: (i) the library purification steps followed the on-bead AMPure XP (Beckman Coulter) approach to minimize sample loss during elution and tube transfer steps; (ii) NEBNext End Repair, A-tailing, and adapter ligation enzyme and buffer volumes were adjusted as appropriate to accommodate on-bead AMPure XP purification; (iii) Illumina dual-index adapters were used in the ligation reaction; and (iv) cfDNA libraries were amplified with Phusion Hot Start Polymerase (Thermo Scientific). All samples underwent four cycles of PCR amplification after the DNA ligation step.

Both genomic sequencing and protein measurements were performed in batches that included samples from individuals with or without cancer, including from other studies, to reduce the possibility that differences between patients with or without cancer were due to batch variability (Supplementary Table S2).

WGS and Alignment

Whole-genome libraries were sequenced using 100-bp paired-end runs (200 cycles) on the Illumina HiSeq2500 platform at 1 to 2× coverage per genome (21–23). Before alignment, adapter sequences were filtered from reads using FASTP software (46). Sequence reads were then aligned to the hg19 human reference genome with Bowtie2 (47), duplicate reads were removed using Sambamba (48), and each aligned pair was converted to a genomic interval representing the sequenced DNA fragment using BEDTools (49). Reads with a MAPQ (Mapping Quality) score of less than 30 or that overlapped the Duke Excluded Regions blacklist (https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeMapability) were excluded. To construct fragmentation profiles from low-coverage WGS that reflected large-scale epigenetic differences in fragmentation across the genome, we partitioned the hg19 reference genome into nonoverlapping 5-Mb bins. Bins with mean GC (guanine and cytosine) base content <0.3 or mean mappability <0.9 were excluded, leaving 473 bins spanning approximately 2.4 Gb of the genome. A fragment-level GC correction was performed independently for short (<150 bp) and long (≥150 bp) cfDNA fragments using an external reference panel of individuals without cancer to generate a target distribution, as previously described (21, 22).

Genome-Wide Fragmentome Analyses

Fragmentation features were calculated as the ratio of short to long fragments in 473 nonoverlapping 5-Mb bins across the genome and as z-scores representing arm gains/losses for autosomal chromosome arms, as described in our previous publications (21, 22).
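As a rough illustration of the ratio feature, the sketch below counts short and long fragments per 5-Mb bin from fragment intervals and returns their ratio. It is a simplified sketch, not the DELFI pipeline: it omits GC correction and bin filtering, reuses the 150-bp cutoff stated for the GC-correction step as the short/long boundary, and assigns each fragment to a bin by its midpoint. The midpoint rule, the function name, and the example intervals are assumptions made here.

```python
from collections import defaultdict

BIN_SIZE = 5_000_000  # 5-Mb bins, as in the text
SHORT_MAX = 150       # fragments < 150 bp are counted as short


def short_long_ratios(fragments, bin_size=BIN_SIZE, short_max=SHORT_MAX):
    """Return {(chrom, bin_index): short/long ratio}.

    `fragments` is an iterable of (chrom, start, end) with 0-based,
    half-open coordinates (BED-style). Each fragment is assigned to the bin
    containing its midpoint (ASSUMPTION of this sketch). Bins without any
    long fragment are omitted to avoid division by zero.
    """
    counts = defaultdict(lambda: [0, 0])  # [n_short, n_long]
    for chrom, start, end in fragments:
        length = end - start
        bin_index = ((start + end) // 2) // bin_size
        counts[(chrom, bin_index)][0 if length < short_max else 1] += 1
    return {
        key: n_short / n_long
        for key, (n_short, n_long) in counts.items()
        if n_long > 0
    }


# Hypothetical fragments: lengths 140, 167, 170 in bin 0; 130, 180 in bin 1.
example = [
    ("chr1", 100, 240),
    ("chr1", 1_000, 1_167),
    ("chr1", 2_000, 2_170),
    ("chr1", 5_000_100, 5_000_230),
    ("chr1", 5_000_500, 5_000_680),
]
ratios = short_long_ratios(example)
```

In this toy input, bin 0 holds one short and two long fragments and bin 1 holds one of each, so the ratios are 0.5 and 1.0.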

Analyses of Publicly Available TCGA Data

Copy-number data from the ovarian cancer cohort in TCGA [ovarian cancer n = 597] were retrieved using the package RTCGA v1.16.0 and were analyzed to determine the frequency of copy-number gains and losses in the 473 5-Mb bins for this cohort (21, 22). A somatic copy-number alteration threshold was used to call gains and losses in the ovarian cohorts (21, 50).

Proteins

Protein analyses were conducted on matched serum of the same patient or plasma from the same aliquot used for cfDNA isolation. The proteins CA-125 (U/mL) and HE4 (pmol/L) (Roche Elecsys II) were measured using a Roche Cobas E602 immunoassay analyzer in EDTA- or Streck-collected plasma (n = 435) and serum (n = 70). Evaluation of protein measurements showed high correlation between current analyses and replicate measurements performed at other institutions (Supplementary Fig. S17A and S17B). A subset of plasma from trial NL58253.031.16 was not available for protein analyses but had been previously evaluated for CA-125 (serum) and HE4 (plasma; n = 86). Protein assessments from multiple centers were batched by biospecimen type, source of collection, prior CA-125 data availability, and cancer stage or noncancer status, and each batch contained a set of technical replicates shared across batches. CA-125 and HE4 were measured at the Johns Hopkins Clinical Chemistry Research Laboratory, Department of Pathology, Division of Clinical Chemistry.

Machine Learning and Cross-Validation

Two machine learning models were developed to predict the presence of ovarian cancer in (i) a screening setting, and (ii) a diagnostic setting. Both models used PLR and features included fragmentation profiles, chromosomal arm–level changes, as well as the protein biomarkers CA-125 and HE4. The models were trained and cross-validated using data from (i) individuals in the subset of the discovery group with ovarian cancer or without any known ovarian lesions for the screening model, and (ii) individuals in the subset of the discovery group with ovarian cancer or benign adnexal masses for the diagnostic model. The principal components of the ratios representing greater than 90% of variance and the z-scores (21, 22), along with levels of the protein biomarkers CA-125 and HE4, were used to train machine learning models. Training was performed with 10 repeats of 5-fold cross-validation, generating a DELFI-Pro score for every individual in the discovery cohort, which was the average over 10 cross-validation repeats. For the validation cohort, DELFI-Pro scores were generated using the locked models. Performance of the models was assessed using ROC analyses and at fixed score thresholds for set specificities in the discovery cohort.
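The resampling scheme described above (10 repeats of 5-fold cross-validation, with each individual's score averaged across repeats) can be sketched as follows. This is a simplified illustration using only the Python standard library, not the caret-based implementation: folds are unstratified, and `toy_fit_predict` is a hypothetical stand-in for the PCA plus penalized logistic regression model. All names and the toy data are assumptions.

```python
import random
from statistics import mean


def repeated_cv_scores(features, labels, fit_predict,
                       n_repeats=10, n_folds=5, seed=0):
    """Average out-of-fold scores over repeated k-fold cross-validation.

    fit_predict(train_x, train_y, test_x) must return one score per test row.
    Every sample is held out exactly once per repeat, so each sample ends up
    with n_repeats scores, which are then averaged.
    """
    rng = random.Random(seed)
    n = len(labels)
    per_sample = [[] for _ in range(n)]
    for _ in range(n_repeats):
        order = list(range(n))
        rng.shuffle(order)
        for fold in range(n_folds):
            test_idx = order[fold::n_folds]
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            scores = fit_predict(
                [features[i] for i in train_idx],
                [labels[i] for i in train_idx],
                [features[i] for i in test_idx],
            )
            for i, score in zip(test_idx, scores):
                per_sample[i].append(score)
    return [mean(scores) for scores in per_sample]


def toy_fit_predict(train_x, train_y, test_x):
    """Hypothetical 1-D model: signed distance from the class-mean midpoint."""
    pos = mean(x for x, y in zip(train_x, train_y) if y == 1)
    neg = mean(x for x, y in zip(train_x, train_y) if y == 0)
    midpoint = (pos + neg) / 2
    return [x - midpoint for x in test_x]


# Toy data: 10 non-cancer (label 0) and 10 cancer (label 1) samples.
features = [0.1 * i for i in range(10)] + [2 + 0.1 * i for i in range(10)]
labels = [0] * 10 + [1] * 10
cv_scores = repeated_cv_scores(features, labels, toy_fit_predict)
```

Because every score is produced by a model that never saw the corresponding sample, the averaged values serve as cross-validated scores for the discovery cohort, whereas a validation cohort would be scored once with the locked model.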

Association of Clinical Covariates and the DELFI Score

Potential associations between clinical covariates and the DELFI-Pro score were assessed with Spearman rank correlation coefficient (continuous variables), Wilcoxon signed-rank test (two categorical variables), and Kruskal–Wallis one-way ANOVA (>2 categorical variables).

Modeling of DELFI Performance in Screening and Diagnostic Settings

Monte Carlo simulations were used to compare the DELFI-Pro approach with other proposed biomarkers (CA-125 and HE4) in a theoretical surveillance population. For CA-125, we used published sensitivities and specificities for CA-125 at 30 U/mL from the UKCTOCS trial (36) and estimated sensitivity and specificity in our cohort using the same threshold. For HE4, we used published sensitivities and specificities for HE4 at 70 pmol/L from (35) and estimated sensitivity and specificity in our cohort using the same threshold. For DELFI-Pro, we used sensitivity based on the score threshold yielding >99% specificity in the discovery cohort. We blended by-stage sensitivity estimates according to the stage distribution of cancers in UKCTOCS (36) and drew a 95% binomial CI around each sensitivity and specificity estimate. As noninvasive blood-based tests have a reported adherence of more than 75% (51, 52), we assumed a point estimate of 75% adherence to a blood-based biomarker test, with a 95% CI of 60% to 90%.
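The blending and interval steps can be illustrated with a short Python sketch. The by-stage sensitivities are those reported for the combined model at >99% specificity; the stage weights are placeholders rather than the UKCTOCS distribution, the counts passed to the interval function are hypothetical, and the Wilson score interval is one common choice of binomial CI assumed here because the text does not name the method.

```python
from math import sqrt


def blend_sensitivity(by_stage, stage_weights):
    """Weighted average of by-stage sensitivities (weights are normalized)."""
    total = sum(stage_weights.values())
    return sum(by_stage[stage] * w / total for stage, w in stage_weights.items())


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Reported by-stage sensitivities of the combined model (stages I to IV).
by_stage = {"I": 0.72, "II": 0.69, "III": 0.87, "IV": 1.00}
# ASSUMPTION: placeholder stage mix, NOT the UKCTOCS stage distribution.
stage_weights = {"I": 0.25, "II": 0.10, "III": 0.50, "IV": 0.15}

blended = blend_sensitivity(by_stage, stage_weights)
low, high = wilson_interval(successes=90, n=100)  # hypothetical counts
```

With these placeholder weights the blended sensitivity is 0.834; substituting the actual stage distribution changes only the weights, not the calculation.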

The R package epiR was used to construct prior predictive probability distributions (β distributions) from these CIs (R package version 2.47, epiR; RRID: SCR_021673) for sensitivity, specificity, and adherence. We estimated the prevalence of ovarian cancer as 0.0037 using Surveillance, Epidemiology, and End Results (37) and U.S. Census data (53) as follows: In 2020, there were 236,511 women with ovarian cancer in the United States, and 2022 census data indicated 63,757,324 women, age 50+. For a single Monte Carlo simulation for DELFI-Pro, we

  • i. sampled the probability of adherence (η) from the prior predictive distribution,

  • ii. simulated the number of individuals (S), out of 100,000, who participated in screening [S ∼ binomial (100,000, η)],

  • iii. sampled the prevalence of ovarian cancer [θ ∼ β (236511, 63520813)],

  • iv. simulated ovarian cancer cases [P ∼ binomial (S, θ)] and computed the number of individuals without cancer (N = S − P),

  • v. sampled the sensitivity (se) and specificity (sp) from the corresponding prior predictive distributions, and

  • vi. sampled the true positives [TP ∼ binomial (P, se)] and false positives [FP ∼ binomial (N, 1 − sp)].

Given TP and FP, we calculated the PPV as (TP)/(TP + FP) and the FPR as FP/N. We repeated the above simulation 1,000 times, obtaining a distribution of PPV and FPR. Using parameters for sensitivity, specificity, and adherence for the CA-125 and HE4 scenarios, we repeated the same Monte Carlo analysis to allow comparisons between the different proposed screening methodologies.
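The simulation steps above can be sketched in Python as follows. This is a minimal illustration, not the epiR-based analysis: the prevalence prior is the β distribution given in the text, whereas the β parameters for adherence, sensitivity, and specificity are illustrative placeholders (chosen to have means of 0.75, 0.70, and 0.99) rather than the priors fitted from the CIs. All function and constant names are assumptions.

```python
import random
from statistics import median

# From the text: 236,511 cases among 63,757,324 women aged 50+.
PREVALENCE_BETA = (236_511, 63_520_813)

# ASSUMPTIONS: illustrative priors, NOT the fitted distributions of the study.
ADHERENCE_BETA = (24, 8)      # mean 0.75
SENSITIVITY_BETA = (70, 30)   # mean 0.70
SPECIFICITY_BETA = (198, 2)   # mean 0.99


def binomial(rng, n, p):
    """Draw from Binomial(n, p); uses binomialvariate on Python >= 3.12."""
    if hasattr(rng, "binomialvariate"):
        return rng.binomialvariate(n, p)
    return sum(rng.random() < p for _ in range(n))  # slow but exact fallback


def simulate_once(rng, population=100_000):
    """One Monte Carlo draw following steps i to vi; returns (PPV, FPR)."""
    adherence = rng.betavariate(*ADHERENCE_BETA)        # step i
    screened = binomial(rng, population, adherence)     # step ii
    prevalence = rng.betavariate(*PREVALENCE_BETA)      # step iii
    cases = binomial(rng, screened, prevalence)         # step iv
    non_cases = screened - cases
    se = rng.betavariate(*SENSITIVITY_BETA)             # step v
    sp = rng.betavariate(*SPECIFICITY_BETA)
    tp = binomial(rng, cases, se)                       # step vi
    fp = binomial(rng, non_cases, 1.0 - sp)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    fpr = fp / non_cases if non_cases else float("nan")
    return ppv, fpr


def simulate(n_sims=1000, population=100_000, seed=1):
    """Repeat the single-draw simulation; returns a list of (PPV, FPR)."""
    rng = random.Random(seed)
    return [simulate_once(rng, population) for _ in range(n_sims)]


results = simulate(n_sims=30, population=20_000)
median_fpr = median(fpr for _, fpr in results)
```

Calling `simulate()` with its defaults mirrors the 1,000 repetitions over 100,000 women; the smaller run shown here only keeps the example fast. Medians and percentile intervals of the returned PPV and FPR values summarize the predictive distributions.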

Bioinformatic and Statistical Software

All statistical analyses were performed using R version 4.1.2. Trimming of adapter sequences was performed using FASTP (0.20.0). We used Bowtie2 (2.3.0) to align paired-end reads to the hg19 reference genome. PCR duplicates were removed using Sambamba (0.6.8), and the remaining aligned read pairs were converted to a bed format using BEDTools (2.29.0). We used the R package data.table (1.12.8) for manipulation of tabular data and binning fragments in 5-Mb windows across the genome. The R package Caret (6.0.84) was used to implement the classification by PLR and resampling.

Data Availability

The code and data needed for generating figures and results are available at https://github.com/cancer-genomics/delfipro2024. The code needed to run the DELFI pipeline and generate features used in modeling is available at https://github.com/cancer-genomics/delfi3. Sequence data and clinical variables generated in this study have been deposited at the database of EU Genome–Phenome Archive under accession codes EGAS00001005340 and EGAS50000000484.

J.E. Medina reports a patent for 63/574,641 pending, a patent for 63/558,893 pending, and a patent for 63/621,749 pending. A.V. Annapragada reports a patent for 63/532,642 pending and licensed to DELFI Diagnostics, a patent for PCT/US2023/078857 pending, licensed, and with royalties paid from DELFI Diagnostics, a patent for 63/574,641 pending, and a patent for 63/558,893 pending. D. Mathios reports other support from Belay Diagnostics outside the submitted work; in addition, D. Mathios has a patent for “Detection of lung cancer with cfDNA fragmentation” licensed to DELFI Diagnostics and a patent for “Detection of brain cancer with cfDNA fragmentation and repeat landscapes” pending; D. Mathios’ wife is founding member of DELFI Diagnostics and has shares from the company. Z.H. Foda reports personal fees from DELFI Diagnostics outside the submitted work; in addition, Z.H. Foda has a patent for “Detecting liver cancer using cell-free DNA fragmentation” issued, licensed, and with royalties paid from DELFI Diagnostics. D.C. Bruhm reports patent applications related to early detection of cancer pending, issued, licensed, and with royalties paid from DELFI Diagnostics. N.A. Vulpescu reports grants from National Institute of General Medical Sciences during the conduct of the study. J.V. Canzoniero reports nonfinancial support from Foundation Medicine and personal fees from Illumina, AstraZeneca, MJH Life Sciences, and Conexient outside the submitted work; in addition, J.V. Canzoniero has a patent for 63/548,318 pending. S. Cristiano reports a patent for US20200131571A1 licensed and a patent for US20240060141A1 licensed. V. Adleff reports grants and personal fees from DELFI Diagnostics during the conduct of the study; in addition V. 
Adleff has a patent for 62/673,516 pending and licensed to DELFI Diagnostics and is an inventor on patent applications submitted by Johns Hopkins University related to cancer genomic analyses and cfDNA for cancer detection that have been licensed to one or more entities, including DELFI Diagnostics and LabCorp. Under the terms of these license agreements, the university and inventors are entitled to fees and royalty distributions. S.B. Baylin reports personal fees from MSP (Methylation Specific PCR) outside the submitted work; in addition, S.B. Baylin has a patent for US-20240156850-A1 with royalties paid, a patent for US-20210161928-A1 with royalties paid, a patent for US-20200239962-A1 with royalties paid, a patent for US-20150031022-A1 with royalties paid, and a patent for US-10363264-B2 with royalties paid. M.F. Press reports personal fees from Zymeworks Inc., Novartis Pharmaceuticals, AstraZeneca, Eli Lilly and Company, Merck & Co., Curio Science, Physicians’ Education Resource, LLC (PER), and Medscape and other support from Jazz Pharmaceuticals and TORL Biotherapeutics, LLC outside the submitted work. D.J. Slamon reports nonfinancial support and other support from BioMarin, grants, nonfinancial support, and other support from Pfizer and Novartis, personal fees from Eli Lilly, and other support from Amgen, Seattle Genetics, 1200 Pharma, and TORL BioTherapeutics outside the submitted work. G.E. Konecny reports Speakers’ Bureau–AstraZeneca; Merck; GSK; Abbvie/Immunogen; Research Funding–Lilly (Inst); Merck (Inst); Consulting–GOG Foundation; Travel, Accommodations, and Expenses–TORL Biotherapeutics; and Expert Testimony–Foundation Medicine. G.A. Meijer reports grants from Stand Up to Cancer–Dutch Cancer Society International Translational Cancer Research Dream Team Grant (SU2C-AACR-DT1415) and nonfinancial support from DELFI Diagnostics during the conduct of the study and nonfinancial support from DELFI Diagnostics outside the submitted work. C.L.
Andersen reports grants from the Danish Council for Strategic Research during the conduct of the study and nonfinancial support from Natera Inc., and C2i Genomics outside the submitted work. R. Drapkin reports personal fees from Repare Therapeutics and Light Horse Therapeutics outside the submitted work; in addition, R. Drapkin has a patent for “Methods for detecting ovarian cancer” issued to Dana-Farber Cancer Institute. R.B. Scharpf reports grants and personal fees from DELFI Diagnostics outside the submitted work and is under a license agreement between DELFI Diagnostics and the Johns Hopkins University; the university and R.B. Scharpf are entitled to royalty distributions related to technology described in the study. Additionally, the university owns equity in DELFI Diagnostics. R.B. Scharpf is a founder of and holds equity in DELFI Diagnostics. He also serves as DELFI’s Head of Data Science. This arrangement has been reviewed and approved by the Johns Hopkins University in accordance with its conflict-of-interest policies. J. Phallen reports other support from DELFI Diagnostics during the conduct of the study; in addition, J. Phallen has a patent for 63/574,641 pending and a patent for 62/673,516 pending and licensed to DELFI Diagnostics. V.E. Velculescu reports grants, personal fees, and other support from DELFI Diagnostics during the conduct of the study and personal fees and other support from Viron Therapeutics and Epitope outside the submitted work; in addition, V.E. Velculescu has a patent for 63/574,641 pending and a patent for 62/673,516 pending and licensed to DELFI Diagnostics. V.E. Velculescu is a founder of DELFI Diagnostics, serves on the board of directors, and owns DELFI Diagnostics stock, which is subject to certain restrictions under university policy. In addition, Johns Hopkins University owns equity in DELFI Diagnostics. V.E. Velculescu divested his equity in Personal Genome Diagnostics to LabCorp in February 2022. V.E. 
Velculescu is an inventor on patent applications submitted by Johns Hopkins University related to cancer genomic analyses and cfDNA for cancer detection that have been licensed to one or more entities, including DELFI Diagnostics, LabCorp, QIAGEN, Sysmex, Agios, Genzyme, Esoterix, Ventana, and ManaT Bio. Under the terms of these license agreements, the university and inventors are entitled to fees and royalty distributions. V.E. Velculescu is an advisor to Viron Therapeutics and Epitope. These arrangements have been reviewed and approved by the Johns Hopkins University in accordance with its conflict-of-interest policies. No disclosures were reported by the other authors.

J.E. Medina: Conceptualization, data curation, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. A.V. Annapragada: Conceptualization, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. P. Lof: Data curation, writing–review and editing. S. Short: Data curation, writing–review and editing. A.L. Bartolomucci: Data curation, writing–review and editing. D. Mathios: Investigation, writing–review and editing. S. Koul: Software, writing–review and editing. N. Niknafs: Investigation, writing–review and editing. M. Noë: Data curation, investigation, writing–review and editing. Z.H. Foda: Investigation, writing–review and editing. D.C. Bruhm: Investigation, writing–review and editing. C. Hruban: Visualization, writing–review and editing. N.A. Vulpescu: Investigation, writing–review and editing. E. Jung: Data curation, writing–review and editing. R. Dua: Data curation, writing–review and editing. J.V. Canzoniero: Investigation, writing–review and editing. S. Cristiano: Investigation, writing–review and editing. V. Adleff: Investigation, writing–review and editing. H. Symecko: Data curation, writing–review and editing. D. van den Broek: Investigation, writing–review and editing. L.J. Sokoll: Data curation, writing–review and editing. S.B. Baylin: Investigation, writing–review and editing. M.F. Press: Investigation, writing–review and editing. D.J. Slamon: Investigation, writing–review and editing. G.E. Konecny: Investigation, writing–review and editing. C. Therkildsen: Investigation, writing–review and editing. B. Carvalho: Investigation, writing–review and editing. G.A. Meijer: Investigation, writing–review and editing. C.L. Andersen: Investigation, writing–review and editing. S.M. Domchek: Investigation, writing–review and editing. R. Drapkin: Investigation, writing–review and editing. R.B. 
Scharpf: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. J. Phallen: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. C.A.R. Lok: Conceptualization, resources, supervision, funding acquisition, project administration, writing–review and editing. V.E. Velculescu: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing.

We thank the members of our laboratories for critical review of the manuscript. This work was supported in part by the Dr. Miriam and Sheldon G. Adelson Medical Research Foundation (to V.E. Velculescu, J. Phallen, and R.B. Scharpf); the Gray Foundation (to V.E. Velculescu and J. Phallen); SU2C In-Time Lung Cancer Interception Dream Team Grant (to V.E. Velculescu and J. Phallen); Stand Up to Cancer-Dutch Cancer Society International Translational Cancer Research Dream Team Grant (SU2C-AACR-DT1415; to V.E. Velculescu); DoD Omics Consortium W81XWH-22-1-0852 (to R. Drapkin); the Honorable Tina Brozman Foundation (to V.E. Velculescu and J. Phallen); the Commonwealth Foundation (to V.E. Velculescu, V. Adleff, and R.B. Scharpf); the Mark Foundation for Cancer Research (to D. Mathios); the Cole Foundation (to V.E. Velculescu); the Claneil Foundation (to R. Drapkin), the Canary Foundation (to R. Drapkin), the Mike and Patti Hennessy Foundation (to R. Drapkin), the Carl H. Goldsmith Ovarian Cancer Translational Research Fund (to R. Drapkin), the Monica K. Young Foundation (to R. Drapkin); a research grant from DELFI Diagnostics (to V.E. Velculescu and R.B. Scharpf); the Stichting Hanarth Fonds (to C.A.R. Lok); the Novo Nordisk Foundation (NNF22OC0074415; C.L. Andersen); the Danish Cancer Society (R257-A14700; C.L. Andersen), and the U.S. NIH grants CA121113 (to V.E. Velculescu), CA006973 (to V.E. Velculescu), CA233259 (to V.E. Velculescu), CA062924 (to V.E. Velculescu and R.B. Scharpf), CA271896 (to V.E. Velculescu), T32GM148383 (to N.A. Vulpescu), 1T32GM136577 (to A.V. Annapragada), F30CA294612 (to A.V. Annapragada), and CA228991 (to R. Drapkin). Stand Up To Cancer is a program of the Entertainment Industry Foundation administered by the American Association for Cancer Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Note: Supplementary data for this article are available at Cancer Discovery Online (http://cancerdiscovery.aacrjournals.org/).

1.
Sung
H
,
Ferlay
J
,
Siegel
RL
,
Laversanne
M
,
Soerjomataram
I
,
Jemal
A
, et al
.
Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries
.
CA Cancer J Clin
2021
;
71
:
209
49
.
2.
Siegel
RL
,
Giaquinto
AN
,
Jemal
A
.
Cancer statistics, 2024
.
CA Cancer J Clin
2024
;
74
:
12
49
.
3.
American Cancer Society
.
Ovarian cancer survival rates. Ovarian cancer early detection, diagnosis, and staging
[
Internet
].
[cited 2024 Mar 12]. Available from:
https://www.cancer.org/cancer/types/ovarian-cancer/detection-diagnosis-staging/survival-rates.html.
4.
Ruhl
J
,
Callaghan
C
,
Schussler
N
.
Summary stage 2018: codes and coding instructions
.
Bethesda (MD)
:
National Cancer Institute
;
2023
.
5.
Prorok
PC
,
Andriole
GL
,
Bresalier
RS
,
Buys
SS
,
Chia
D
,
Crawford
ED
, et al
.
Design of the prostate, lung, colorectal and ovarian (PLCO) cancer screening trial
.
Control Clin Trials
2000
;
21
:
273S
309S
.
6.
Menon
U
,
Gentry-Maharaj
A
,
Burnell
M
,
Singh
N
,
Ryan
A
,
Karpinskyj
C
, et al
.
Ovarian cancer population screening and mortality after long-term follow-up in the UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS): a randomised controlled trial
.
Lancet
2021
;
397
:
2182
93
.
7.
Han
CY
,
Lu
KH
,
Corrigan
G
,
Perez
A
,
Kohring
SD
,
Celestino
J
, et al
.
Normal risk ovarian screening study: 21-year update
.
J Clin Oncol
2024
;
42
:
1102
9
.
8.
Kim
J
,
Park
EY
,
Kim
O
,
Schilder
JM
,
Coffey
DM
,
Cho
C-H
, et al
.
Cell origins of high-grade serous ovarian cancer
.
Cancers (Basel)
2018
;
10
:
433
.
9.
Lof
P
,
van de Vrie
R
,
Korse
CM
,
van Gent
MDJM
,
Mom
CH
,
Rosier-van Dunné
FMF
, et al
.
Can serum human epididymis protein 4 (HE4) support the decision to refer a patient with an ovarian mass to an oncology hospital?
Gynecol Oncol
2022
;
166
:
284
91
.
10.
Schummer
M
,
Ng
WV
,
Bumgarner
RE
,
Nelson
PS
,
Schummer
B
,
Bednarski
DW
, et al
.
Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery of genes overexpressed in ovarian carcinomas
.
Gene
1999
;
238
:
375
85
.
11.
Chen
F
,
Shen
J
,
Wang
J
,
Cai
P
,
Huang
Y
.
Clinical analysis of four serum tumor markers in 458 patients with ovarian tumors: diagnostic value of the combined use of HE4, CA125, CA19-9, and CEA in ovarian tumors
.
Cancer Manag Res
2018
;
10
:
1313
8
.
12.
Reilly
G
,
Bullock
RG
,
Greenwood
J
,
Ure
DR
,
Stewart
E
,
Davidoff
P
, et al
.
Analytical validation of a deep neural network algorithm for the detection of ovarian cancer
.
JCO Clin Cancer Inform
2022
;
6
:
e2100192
.
13.
Jacobs
I
,
Oram
D
,
Fairbanks
J
,
Turner
J
,
Frost
C
,
Grudzinskas
JG
.
A risk of malignancy index incorporating CA 125, ultrasound and menopausal status for the accurate preoperative diagnosis of ovarian cancer
.
Br J Obstet Gynaecol
1990
;
97
:
922
9
.
14. Timmerman D, Valentin L, Bourne TH, Collins WP, Verrelst H, Vergote I, et al. Terms, definitions and measurements to describe the sonographic features of adnexal tumors: a consensus opinion from the International Ovarian Tumor Analysis (IOTA) Group. Ultrasound Obstet Gynecol 2000;16:500–5.
15. Phallen J, Sausen M, Adleff V, Leal A, Hruban C, White J, et al. Direct detection of early-stage cancers using circulating tumor DNA. Sci Transl Med 2017;9:eaan2415.
16. Lennon AM, Buchanan AH, Kinde I, Warren A, Honushefsky A, Cohain AT, et al. Feasibility of blood testing combined with PET-CT to screen for cancer and guide intervention. Science 2020;369:eabb9601.
17. Klein EA, Richards D, Cohn A, Tummala M, Lapham R, Cosgrove D, et al. Clinical validation of a targeted methylation-based multi-cancer early detection test using an independent validation set. Ann Oncol 2021;32:1167–77.
18. Taylor MS, Wu C, Fridy PC, Zhang SJ, Senussi Y, Wolters JC, et al. Ultrasensitive detection of circulating LINE-1 ORF1p as a specific multicancer biomarker. Cancer Discov 2023;13:2532–47.
19. Sato S, Gillette M, de Santiago PR, Kuhn E, Burgess M, Doucette K, et al. LINE-1 ORF1p as a candidate biomarker in high grade serous ovarian carcinoma. Sci Rep 2023;13:1537.
20. Leal A, van Grieken NCT, Palsgrove DN, Phallen J, Medina JE, Hruban C, et al. White blood cell and cell-free DNA analyses for detection of residual disease in gastric cancer. Nat Commun 2020;11:525.
21. Mathios D, Johansen JS, Cristiano S, Medina JE, Phallen J, Larsen KR, et al. Detection and characterization of lung cancer using cell-free DNA fragmentomes. Nat Commun 2021;12:5060.
22. Foda ZH, Annapragada AV, Boyapati K, Bruhm DC, Vulpescu NA, Medina JE, et al. Detecting liver cancer using cell-free DNA fragmentomes. Cancer Discov 2023;13:616–31.
23. Cristiano S, Leal A, Phallen J, Fiksel J, Adleff V, Bruhm DC, et al. Genome-wide cell-free DNA fragmentation in patients with cancer. Nature 2019;570:385–9.
24. Bruhm DC, Mathios D, Foda ZH, Annapragada AV, Medina JE, Adleff V, et al. Single-molecule genome-wide mutation profiles of cell-free DNA for non-invasive detection of cancer. Nat Genet 2023;55:1301–10.
25. Annapragada AV, Niknafs N, White JR, Bruhm DC, Cherry C, Medina JE, et al. Genome-wide repeat landscapes in cancer and cell-free DNA. Sci Transl Med 2024;16:eadj9283.
26. Noë M, Mathios D, Annapragada AV, Koul S, Foda ZH, Medina JE, et al. DNA methylation and gene expression as determinants of genome-wide cell-free DNA fragmentation. Nat Commun 2024;15:6690.
27. Medina JE, Dracopoli NC, Bach PB, Lau A, Scharpf RB, Meijer GA, et al. Cell-free DNA approaches for cancer early detection and interception. J Immunother Cancer 2023;11:e006013.
28. van’t Erve I, Medina JE, Leal A, Papp E, Phallen J, Adleff V, et al. Metastatic colorectal cancer treatment response evaluation by ultra-deep sequencing of cell-free DNA and matched white blood cells. Clin Cancer Res 2023;29:899–909.
29. Gaillard DHK, Lof P, Sistermans EA, Mokveld T, Horlings HM, Mom CH, et al. Evaluating the effectiveness of pre-operative diagnosis of ovarian cancer using minimally invasive liquid biopsies by combining serum human epididymis protein 4 and cell-free DNA in patients with an ovarian mass. Int J Gynecol Cancer 2024;34:713–21.
30. Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 2011;474:609–15.
31. Macintyre G, Goranova TE, De Silva D, Ennis D, Piskorz AM, Eldridge M, et al. Copy number signatures and mutational processes in ovarian carcinoma. Nat Genet 2018;50:1262–70.
32. Labidi-Galy SI, Papp E, Hallberg D, Niknafs N, Adleff V, Noe M, et al. High grade serous ovarian carcinomas originate in the fallopian tube. Nat Commun 2017;8:1093.
33. SEER. SEER cancer stat facts: ovarian cancer [Internet]. Bethesda (MD): National Cancer Institute. [cited 2024 Mar 16]. Available from: https://seer.cancer.gov/statfacts/html/ovary.html.
34. Lof P, Engelhardt EG, van Gent MDJM, Mom CH, Rosier-van Dunné FMF, van Baal WM, et al. Psychological impact of referral to an oncology hospital on patients with an ovarian mass. Int J Gynecol Cancer 2022;33:74–82.
35. Jacob F, Meier M, Caduff R, Goldstein D, Pochechueva T, Hacker N, et al. No benefit from combining HE4 and CA125 as ovarian tumor markers in a clinical setting. Gynecol Oncol 2011;121:487–91.
36. Menon U, Ryan A, Kalsi J, Gentry-Maharaj A, Dawnay A, Habib M, et al. Risk algorithm using serial biomarker measurements doubles the number of screen-detected cancers compared with a single-threshold rule in the United Kingdom collaborative trial of ovarian cancer screening. J Clin Oncol 2015;33:2062–71.
37. SEER*Explorer. An interactive website for SEER cancer statistics [Internet]. Bethesda (MD): Surveillance Research Program, National Cancer Institute; 2023. [cited 2024 Mar 17]. Available from: https://seer.cancer.gov/statistics-network/explorer/.
38. Runowicz CD. Ovarian cancer screening. In: Genetic susceptibility to breast and ovarian cancer. Bethesda (MD): American College of Medical Genetics; 1999 [cited 2024 Mar 17]. Available from: https://www.ncbi.nlm.nih.gov/books/NBK56952/.
39. Mathieu KB, Bedi DG, Thrower SL, Qayyum A, Bast RC Jr. Screening for ovarian cancer: imaging challenges and opportunities for improvement. Ultrasound Obstet Gynecol 2018;51:293–303.
40. Menon U, Gentry-Maharaj A, Burnell M, Ryan A, Singh N, Manchanda R, et al. Tumour stage, treatment, and survival of women with high-grade serous tubo-ovarian cancer in UKCTOCS: an exploratory analysis of a randomised controlled trial. Lancet Oncol 2023;24:1018–28.
41. Buys SS, Partridge E, Black A, Johnson CC, Lamerato L, Isaacs C, et al. Effect of screening on ovarian cancer mortality: the prostate, lung, colorectal and ovarian (PLCO) cancer screening randomized controlled trial. JAMA 2011;305:2295–303.
42. Connal S, Cameron JM, Sala A, Brennan PM, Palmer DS, Palmer JD, et al. Liquid biopsies: the future of cancer early detection. J Transl Med 2023;21:118.
43. Cohen JD, Li L, Wang Y, Thoburn C, Afsari B, Danilova L, et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 2018;359:926–30.
44. Bach PB. Late-stage cancer end points to speed cancer screening clinical trials-not so fast. JAMA 2024;331:1894–5.
45. Rasmussen L, Wilhelmsen M, Christensen IJ, Andersen J, Jørgensen LN, Rasmussen M, et al. Protocol outlines for parts 1 and 2 of the prospective endoscopy III study for the early detection of colorectal cancer: validation of a concept based on blood biomarkers. JMIR Res Protoc 2016;5:e182.
46. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 2018;34:i884–90.
47. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods 2012;9:357–9.
48. Tarasov A, Vilella AJ, Cuppen E, Nijman IJ, Prins P. Sambamba: fast processing of NGS alignment formats. Bioinformatics 2015;31:2032–4.
49. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 2010;26:841–2.
50. Davoli T, Uno H, Wooten EC, Elledge SJ. Tumor aneuploidy correlates with markers of immune evasion and with reduced response to immunotherapy. Science 2017;355:eaaf8399.
51. Bokhorst LP, Alberts AR, Rannikko A, Valdagni R, Pickles T, Kakehi Y, et al. Compliance rates with the prostate cancer research international active surveillance (PRIAS) protocol and disease reclassification in noncompliers. Eur Urol 2015;68:814–21.
52. Duffy MJ, van Rossum LGM, van Turenhout ST, Malminiemi O, Sturgeon C, Lamerz R, et al. Use of faecal markers in screening for colorectal neoplasia: a European group on tumor markers position paper. Int J Cancer 2011;128:3–11.
53. Day JC. Population projections of the United States by age, sex, race, and Hispanic origin: 1995 to 2050, U.S. Bureau of the Census, Current Population Reports. Washington (DC): U.S. Government Printing Office. p. 25–1130.
This open access article is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.
