Background:

Cure fraction—the proportion of persons considered cured of cancer after long-term follow-up—reflects the total impact of cancer control strategies, including screening, without lead-time bias. Previous studies have not reported stage-stratified cure fraction across the spectrum of cancer types.

Methods:

Using a mixture cure model, we estimated cure fraction across stages for 21 cancer types and additional subtypes. Cause-specific survival for 2.4 million incident cancers came from 17 US Surveillance, Epidemiology, and End Results registries for adults 40 to 84 years at diagnosis in 2006 to 2015, followed through 2020.

Results:

Across cancer types, a substantial cure fraction was evident at early stages, followed by either a sharp drop from stages III to IV or a steady decline from stages I to IV. For example, estimated cure fractions for colorectal cancer at stages I, II, III, and IV were 62% (95% confidence interval: 59%–66%), 61% (58%–65%), 58% (57%–59%), and 7% (7%–7%), respectively. Corresponding estimates for gallbladder cancer were 50% (46%–54%), 24% (22%–27%), 22% (19%–25%), and 2% (2%–3%). Differences in 5-year cause-specific survival between early-stage and stage IV cancers were highly correlated with between-stage differences in cure fraction, indicating that survival gaps by stage are persistent and not due to lead-time bias.

Conclusions:

A considerable fraction of cancer is amenable to cure at early stages, but not after metastasis.

Impact:

These results emphasize the potential for early detection of numerous cancers, including those with no current screening modalities, to reduce cancer death.

Cancer can be a survivable disease, especially when detected at an early stage. Cancer survivors currently comprise approximately 5% of the entire population in several western nations (1–4), and the percentage is expected to increase substantially in the coming years due to population aging (5, 6), advances in early detection and treatment (7), reduction of competing risks such as cardiovascular mortality (8), and overdiagnosis of nonlethal cancer (9).

Cure fraction is defined statistically as the proportion of patients diagnosed with cancer in a population who survive past the early peak in cancer-specific mortality and attain virtually the same risk of death as the age- and sex-matched general population with similar competing risks (10). In other words, cured patients are those who achieve stable long-term survival with minimal risk of mortality from their initial cancer. For cancer types in which the long-term risk of cancer death does not reach zero, cure fraction can still characterize the proportion of patients who reach this asymptotic level of risk (11–13). When used to evaluate cancer screening, cure fraction has the advantage of avoiding lead-time bias because it is based on long-term mortality data rather than on survival duration. By providing a measure of the overall effectiveness of cancer detection, treatment, and follow-up, cure fraction serves as a useful real-world indicator of long-term cancer outcomes in populations. For example, studies of cure fraction by cancer site have been used to monitor public health progress in cancer care, as well as to identify cancer types for which existing control strategies have had a limited effect (14–20).

Stage at diagnosis is among the strongest predictors of cancer outcomes, including survivorship trajectory and mortality risk, which differ starkly between localized and metastatic cancers (12). Accordingly, cure fraction is expected to vary substantially by both cancer type and stage. Understanding differences in cure fraction by cancer type and stage, in turn, can shed light on the potential for reducing the population burden of cancer through earlier detection, especially using new screening interventions (in addition to optimized treatment), by shifting the stage distribution at diagnosis from later-stage toward earlier-stage disease. This information also serves as quantitative evidence to help guide the allocation of public health resources for cancer prevention, detection, treatment, and management (21). To date, most studies of cure fraction have focused on individual cancer types or have not reported stage-specific cure fractions. Therefore, to better understand the potential for long-term cure through diagnosis at earlier stages across the spectrum of cancers, we undertook an analysis of cure fraction stratified by stage for all major stageable cancer types in the US population.

Data source

We obtained cause-specific survival data for this study from the US Surveillance, Epidemiology, and End Results (SEER) population-based cancer registries for 17 geographic regions, covering cancers diagnosed from 2006 to 2015, with follow-up for mortality through 2020 (https://seer.cancer.gov/data/). These years were selected to enable uniform classification of cancer stage according to the 6th edition of the American Joint Committee on Cancer (AJCC) staging manual (22), and to balance recency of data against availability of long-term follow-up. We included patients diagnosed with a first primary malignancy at ages 40 to 84 years, excluding those diagnosed only by death certificate or autopsy and those with missing/unknown data on age, vital status, survival time, or cause of death (1.5% of the starting cohort). Survival time was defined as the duration of the interval from diagnosis until death from the primary cancer (coded according to SEER's cause-specific death classification; ref. 23), death from another cause, or the end of follow-up on December 31, 2020.

Cases were grouped by primary cancer site using codes from the International Classification of Diseases for Oncology, 3rd edition (ICD-O-3), and by AJCC stage (I, II, III, IV, or missing/unknown). Pre-invasive/non-invasive tumors, such as in situ neoplasms, were excluded. Primary cancer types included in this analysis, defined by anatomic site, were anus, bladder, breast, cervix, colon/rectum, esophagus, gallbladder, head/neck, kidney, liver/intrahepatic bile duct, lung, lymphoma, melanoma, ovary, pancreas, prostate, sarcoma, stomach, thyroid, urothelial tract, and uterus. Brain/other nervous system cancers, leukemia, and myeloma lack AJCC 6th edition staging criteria, and therefore were excluded. Certain benign, rare, or hematopoietic histologies (ICD-O-3 histology codes 8710–8931, 9040–9055, 9120–9342, 9580–9992) were excluded from primary solid tumor sites. Breast cancers were further subclassified using SEER Extent of Disease codes as hormone receptor (HR)-positive (i.e., positive for estrogen receptor and/or progesterone receptor), HR-negative (i.e., negative for estrogen receptor and progesterone receptor), or unknown/unclassifiable HR status (i.e., neither HR-positive nor HR-negative). Lung cancers were further subclassified histologically as squamous cell carcinoma (ICD-O-3 histology codes 8050–8052, 8070–8076, 8078, 8083–8084, 8090, 8094, 8123), adenocarcinoma (ICD-O-3 histology codes 8140–8141, 8143–8145, 8147, 8190, 8201, 8211, 8250–8255, 8260, 8290, 8310, 8320, 8323, 8333, 8401, 8440, 8470–8471, 8480–8481, 8490, 8503, 8507, 8550, 8570–8572, 8574, 8576), small-cell carcinoma (ICD-O-3 histology codes 8002, 8041–8045), neuroendocrine (ICD-O-3 histology codes 8013, 8240–8243, 8246, 8249, with or without including small-cell carcinoma), and other histologic subtypes. 
We also subclassified esophageal and cervical cancers as squamous cell carcinoma (ICD-O-3 histology codes 8050–8123) or adenocarcinoma (ICD-O-3 histology codes 8140–8576), kidney cancers as renal parenchyma (ICD-O-3 site code C64.9) or renal pelvis (ICD-O-3 site code C65.9), and lymphomas as non-Hodgkin or Hodgkin (based on SEER Site Recode).

Life tables were generated for 5-year age bands from 40 to 84 years at diagnosis. The primary analysis was performed on individuals aged 40 to 79 years at diagnosis, the age range in which most population screening takes place. We performed secondary analyses aggregated by age bands at diagnosis (40–54, 55–64, 65–74, and 75–84 years) to examine age-related changes in the ability to reach long-term survival. To evaluate the potential role of surgery in attaining statistical cure, we used SEER Therapy coding for “cancer-directed surgery” to perform a secondary analysis among patients who received primary surgical treatment with curative intent.

This study was not subject to institutional review board approval or informed consent due to its secondary use of de-identified data.

Statistical methods

Using SEER*Stat software (https://seer.cancer.gov/seerstat/), we obtained standard life tables with cause-specific survival calculated using the actuarial method for monthly intervals up to 179 months of follow-up.
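As an illustration of the actuarial (life-table) calculation described above, the following minimal Python sketch computes cumulative cause-specific survival from interval counts. It is not the SEER*Stat implementation; the function name, its arguments, and the toy counts are hypothetical, and the sketch assumes the standard actuarial convention that cases withdrawn during an interval (censored alive or dead from another cause) contribute half an interval of risk.

```python
def actuarial_survival(n_start, deaths, withdrawals):
    """Cumulative survival at the end of each interval (actuarial method).

    n_start     -- number of cases at risk at the start of follow-up
    deaths      -- cancer deaths in each interval (e.g., each month)
    withdrawals -- cases censored in each interval (alive at end of
                   follow-up or dead from another cause)
    """
    survival = []
    at_risk = n_start
    cumulative = 1.0
    for d, w in zip(deaths, withdrawals):
        # Assumption: withdrawals are at risk for half the interval on average.
        effective_at_risk = at_risk - w / 2.0
        q = d / effective_at_risk if effective_at_risk > 0 else 0.0
        cumulative *= 1.0 - q          # conditional survival through this interval
        survival.append(cumulative)
        at_risk -= d + w               # cases entering the next interval
    return survival


# Hypothetical toy counts for three monthly intervals (not SEER data).
curve = actuarial_survival(1000, deaths=[50, 40, 30], withdrawals=[10, 20, 30])
```

In the study, such a table would extend over monthly intervals up to 179 months for each cancer type and stage.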

We used a mixture cure model (10, 24, 25), which divides cancer patients into two populations of “cured” and “uncured,” to estimate cure fraction by cancer type and stage, with no additional model covariates. “Uncured” patients were modeled using a two-parameter Weibull survival distribution, whereas “cured” patients were modeled as having a small exponential hazard (“residual excess risk”), thereby accounting for the possibility that complete cure (zero excess cancer-specific mortality risk) may not be possible or resolvable from the data for some cancers (11–13, 26). Given the residual excess risk, it may be more accurate to refer to “cured” patients as “long-term survivors” and “uncured” patients as “long-term nonsurvivors.”
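The structure of this model can be sketched as a survival function that mixes the two groups. The Python below is an illustrative sketch only, not the authors' R/Stan code: the parameter names, the scale/shape form of the Weibull, and the example values are assumptions chosen for the illustration.

```python
import math


def mixture_cure_survival(t, cure_fraction, weibull_shape, weibull_scale,
                          residual_hazard):
    """Cause-specific survival at time t under a two-group mixture cure model.

    "Cured" cases follow an exponential survival curve with a small constant
    hazard (the residual excess risk); "uncured" cases follow a two-parameter
    Weibull survival curve. Time units are arbitrary but must be consistent.
    """
    cured = math.exp(-residual_hazard * t)
    uncured = math.exp(-((t / weibull_scale) ** weibull_shape))
    return cure_fraction * cured + (1.0 - cure_fraction) * uncured


# Hypothetical parameter values, with time in months.
s_5yr = mixture_cure_survival(60, cure_fraction=0.58, weibull_shape=1.2,
                              weibull_scale=24.0, residual_hazard=0.0005)
```

With a residual hazard of zero the curve plateaus at the cure fraction; with a small positive hazard it continues to decline slowly, which is why "long-term survivors" may be the more accurate label.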

We used the fact that cancer is a progressive disease to develop constraints on model parameters to stabilize fitting across all stages simultaneously, excluding unknown/missing stage. Specifically, the Weibull distribution for uncured patients in each stage was constrained so that long-term mortality was greater with later stage at diagnosis; the exponential hazard for cured patients in each stage was constrained so that later stages had higher potential for recurrence; and the cure fraction was constrained to decrease with later stage. These plausible constraints were implemented to enable the estimation of biologically realistic confidence intervals (CI) and median estimates that lay within feasible ranges, as compared with extremely wide CIs for early-stage cancers based on unconstrained models. We constrained the risk of recurrence to be relatively low per year, but did not require it to be zero (i.e., full statistical cure), thereby accommodating a continuing risk of recurrence for some cancers, such as lung.
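One common way to enforce an ordering of this kind is to express each stage's cure fraction as the previous stage's value multiplied by a ratio between 0 and 1. The sketch below shows that idea in Python; it is an assumption about how such a constraint could be built, not a description of the authors' Stan parameterization, and the function names are hypothetical.

```python
def ordered_cure_fractions(ratios):
    """Turn per-stage ratios in [0, 1] into cure fractions that never
    increase from stage I to stage IV (cumulative product)."""
    fractions = []
    current = 1.0
    for ratio in ratios:
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("each ratio must lie in [0, 1]")
        current *= ratio
        fractions.append(current)
    return fractions


def is_non_increasing(values):
    """True when each value is no larger than the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# Any ratios in [0, 1] yield a valid, stage-ordered set of cure fractions.
example = ordered_cure_fractions([0.9, 0.8, 0.7, 0.2])

# The abstract's colorectal estimates (stages I to IV, in percent) satisfy
# the ordering.
colorectal = [62, 61, 58, 7]
```

The same construction could be applied in the opposite direction to quantities constrained to increase with stage, such as the residual hazard among cured patients.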

We fit the model using a Markov chain Monte Carlo (MCMC) technique to optimize the likelihood of data given a parameter set, and to recover uncertainty in parameters given the data, while respecting the constraints by stage. We estimated 95% CIs using the sampled parameters of the MCMC fit. All analyses were conducted using R (https://www.r-project.org/) and Stan software (27) with the R package feather (28). Code and data are available at https://github.com/grailbio-publications/Hubbell_Cure_Fraction. Due to privacy concerns related to providing data with small numbers of events in some cells, we provide the specifications for the original SEER data draw, along with synthetic data generated to match the large-scale statistics to demonstrate the code. Figures and tables reported in this paper are from the original data only. Interested individuals can retrieve the original SEER data from the draw specifications.
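To illustrate how a 95% interval can be read off sampled parameters, the sketch below runs a simple random-walk Metropolis sampler on a toy problem and takes the 2.5th and 97.5th percentiles of the draws. This is a deliberately simplified stand-in: the authors used Stan, whose sampler differs, and the toy binomial data, flat prior, step size, and function names are all assumptions made for the example.

```python
import math
import random


def log_posterior(p, events, n):
    """Binomial log-likelihood with a flat prior on p in (0, 1)."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return events * math.log(p) + (n - events) * math.log(1.0 - p)


def metropolis(events, n, n_draws=5000, burn_in=1000, step=0.05, seed=1):
    """Random-walk Metropolis draws for a single proportion."""
    rng = random.Random(seed)
    p = 0.5
    lp = log_posterior(p, events, n)
    draws = []
    for i in range(n_draws + burn_in):
        proposal = p + rng.gauss(0.0, step)
        lp_prop = log_posterior(proposal, events, n)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            p, lp = proposal, lp_prop
        if i >= burn_in:
            draws.append(p)
    return draws


def percentile_interval(draws, level=0.95):
    """Equal-tailed interval from sorted draws (nearest-rank percentiles)."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = int(round((1.0 - tail) * (len(ordered) - 1)))
    return ordered[lo_idx], ordered[hi_idx]


# Hypothetical toy data: 60 long-term survivors among 100 cases.
draws = metropolis(events=60, n=100)
lower, upper = percentile_interval(draws)
```

In the actual analysis the sampled quantity would be the full parameter set (cure fraction, Weibull parameters, and residual hazard for every stage), with the stage constraints respected in each draw.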

Table 1 summarizes characteristics of the 2,391,278 cancer cases included in this analysis, stratified by stage at diagnosis. At the end of follow-up (maximum = 14 years and 11 months), 787,476 cases (32.9%) had died from their cancer, 347,794 (14.5%) had died from another cause, and 1,256,008 (52.5%) were still alive. Observed (not extrapolated) cumulative cause-specific mortality risk for the entire cohort at the end of follow-up was 12.0% for stage I cancers of all types combined, 14.5% for stage II, 45.8% for stage III, and 77.5% for stage IV.
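The vital-status percentages quoted above follow directly from the reported counts, as this short Python check shows. The counts are taken from the text; the variable names are arbitrary.

```python
# Counts at the end of follow-up, as reported in the text.
total_cases = 2_391_278
vital_status = {
    "dead_from_cancer": 787_476,
    "dead_from_other_cause": 347_794,
    "alive": 1_256_008,
}

# Percentage of the cohort in each category, rounded to one decimal place.
shares = {
    status: round(100.0 * count / total_cases, 1)
    for status, count in vital_status.items()
}
```

The three categories sum to the full cohort of 2,391,278 cases, so no case is left unclassified.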

Table 1.

Characteristics of cancer cases by stage at diagnosis, ages 40 to 84 years at diagnosis from 2006 to 2015, followed for mortality through 2020, SEER 17 registries.

Columns, left to right: characteristic; stage at diagnosis I, II, III, IV, and unknown/missing. Cells show N (%) unless otherwise noted.
Age at diagnosis (years) 
 40–44 43,508 (6.5%) 22,165 (3.2%) 14,065 (4.0%) 13,119 (2.8%) 8,907 (4.1%) 
 45–49 57,678 (8.6%) 41,342 (6.0%) 26,256 (7.5%) 27,583 (5.9%) 14,323 (6.6%) 
 50–54 80,652 (12.0%) 68,475 (9.9%) 39,948 (11.5%) 46,653 (10.0%) 22,329 (10.3%) 
 55–59 93,813 (14.0%) 99,071 (14.4%) 50,545 (14.5%) 64,673 (13.9%) 26,634 (12.2%) 
 60–64 101,534 (15.1%) 120,366 (17.5%) 56,032 (16.1%) 74,904 (16.1%) 29,816 (13.7%) 
 65–69 100,658 (15.0%) 126,285 (18.3%) 55,132 (15.8%) 74,289 (15.9%) 31,791 (14.6%) 
 70–74 81,107 (12.1%) 96,910 (14.1%) 44,447 (12.7%) 64,829 (13.9%) 29,405 (13.5%) 
 75–79 65,019 (9.7%) 69,913 (10.2%) 35,583 (10.2%) 56,050 (12.0%) 27,791 (12.8%) 
 80–84 47,081 (7.0%) 43,683 (6.3%) 26,604 (7.6%) 43,731 (9.4%) 26,579 (12.2%) 
Sex 
 Male 219,448 (32.7%) 461,583 (67.1%) 170,937 (49.0%) 266,071 (57.1%) 117,395 (54.0%) 
 Female 451,602 (67.3%) 226,627 (32.9%) 177,675 (51.0%) 199,760 (42.9%) 100,180 (46.0%) 
Race/ethnicity 
 White 496,340 (74.0%) 475,182 (69.0%) 241,833 (69.4%) 323,914 (69.5%) 140,900 (64.8%) 
 Black 50,723 (7.6%) 87,205 (12.7%) 37,795 (10.8%) 55,263 (11.9%) 23,968 (11.0%) 
 Hispanic 65,727 (9.8%) 72,241 (10.5%) 38,533 (11.1%) 49,065 (10.5%) 28,183 (13.0%) 
 Asian American/Pacific Islander 48,121 (7.2%) 43,947 (6.4%) 27,345 (7.8%) 33,693 (7.2%) 16,523 (7.6%) 
 American Indian/Alaska Native 3,754 (0.6%) 3,512 (0.5%) 2,289 (0.7%) 3,196 (0.7%) 1,451 (0.7%) 
 Other/unknown 6,385 (1.0%) 6,123 (0.9%) 817 (0.2%) 700 (0.2%) 6,550 (3.0%) 
Year of diagnosis 
 2006 59,440 (8.9%) 70,272 (10.2%) 32,955 (9.5%) 42,743 (9.2%) 22,768 (10.5%) 
 2007 61,714 (9.2%) 73,024 (10.6%) 33,929 (9.7%) 43,473 (9.3%) 23,152 (10.6%) 
 2008 63,804 (9.5%) 71,291 (10.4%) 34,453 (9.9%) 44,058 (9.5%) 22,692 (10.4%) 
 2009 65,785 (9.8%) 71,607 (10.4%) 35,070 (10.1%) 45,034 (9.7%) 22,857 (10.5%) 
 2010 65,506 (9.8%) 70,863 (10.3%) 34,534 (9.9%) 46,035 (9.9%) 20,919 (9.6%) 
 2011 66,510 (9.9%) 71,867 (10.4%) 34,697 (10.0%) 46,247 (9.9%) 21,363 (9.8%) 
 2012 69,274 (10.3%) 66,084 (9.6%) 34,959 (10.0%) 47,690 (10.2%) 20,584 (9.5%) 
 2013 70,401 (10.5%) 64,986 (9.4%) 34,929 (10.0%) 48,782 (10.5%) 20,823 (9.6%) 
 2014 73,304 (10.9%) 63,054 (9.2%) 35,795 (10.3%) 50,190 (10.8%) 20,955 (9.6%) 
 2015 75,312 (11.2%) 65,162 (9.5%) 37,291 (10.7%) 51,579 (11.1%) 21,462 (9.9%) 
Receipt of definitive surgery 
 Surgery performed 593,087 (88.4%) 414,166 (60.2%) 227,223 (65.2%) 109,199 (23.4%) 83,317 (38.3%) 
 Surgery recommended, not performed 6,730 (1.0%) 23,475 (3.4%) 5,149 (1.5%) 12,855 (2.8%) 17,524 (8.1%) 
 Surgery recommended, unknown if performed 1,869 (0.3%) 4,282 (0.6%) 1,571 (0.5%) 1,669 (0.4%) 1,893 (0.9%) 
 Surgery not recommended 68,462 (10.2%) 244,222 (35.5%) 114,134 (32.7%) 340,486 (73.1%) 104,657 (48.1%) 
 Unknown 902 (0.1%) 2,065 (0.3%) 535 (0.2%) 1,622 (0.3%) 10,184 (4.7%) 
Vital status at study end 
 Alive 483,133 (72.0%) 468,806 (68.1%) 144,299 (41.4%) 65,540 (14.1%) 94,230 (43.3%) 
 Dead from index cancer 80,238 (12.0%) 99,615 (14.5%) 159,820 (45.8%) 361,070 (77.5%) 86,733 (39.9%) 
 Dead from other cause 107,679 (16.0%) 119,789 (17.4%) 44,493 (12.8%) 39,221 (8.4%) 36,612 (16.8%) 
Median follow-up (range: 0–179 months) 
 Alive 108 months 114 months 106 months 96 months 106 months 
 Dead from primary cancer 28 months 28 months 16 months 6 months 9 months 
 Dead from other cause 63 months 70 months 47 months 16 months 44 months 
Primary cancer type 
 Anus 2,035 (0.3%) 3,141 (0.5%) 2,946 (0.8%) 888 (0.2%) 2,573 (1.2%) 
 Bladder 22,070 (3.3%) 10,552 (1.5%) 4,163 (1.2%) 7,795 (1.7%) 3,771 (1.7%) 
 Breast, all 195,164 (29.1%) 139,695 (20.3%) 45,941 (13.2%) 22,166 (4.8%) 17,354 (8.0%) 
 Breast, HR-positive (female) 166,473 (24.8%) 107,848 (15.7%) 33,364 (9.6%) 14,878 (3.2%) 8,772 (4.0%) 
 Breast, HR-negative (female) 22,658 (3.4%) 26,609 (3.9%) 10,742 (3.1%) 4,581 (1.0%) 2,228 (1.0%) 
 Breast, HR-unknown (female) 5,163 (0.8%) 3,988 (0.6%) 1,343 (0.4%) 2,464 (0.5%) 6,189 (2.8%) 
 Cervix, all 8,401 (1.3%) 3,017 (0.4%) 4,615 (1.3%) 3,493 (0.7%) 1,603 (0.7%) 
 Cervix, adenocarcinoma 3,031 (0.5%) 664 (0.1%) 892 (0.3%) 879 (0.2%) 308 (0.1%) 
 Cervix, squamous cell carcinoma 5,185 (0.8%) 2,279 (0.3%) 3,571 (1.0%) 2,244 (0.5%) 759 (0.3%) 
 Colon/rectum 52,670 (7.8%) 54,481 (7.9%) 59,488 (17.1%) 48,604 (10.4%) 21,069 (9.7%) 
 Esophagus, all 3,620 (0.5%) 4,539 (0.7%) 5,234 (1.5%) 9,300 (2.0%) 3,482 (1.6%) 
 Esophagus, adenocarcinoma 2,508 (0.4%) 2,847 (0.4%) 3,220 (0.9%) 6,383 (1.4%) 1,616 (0.7%) 
 Esophagus, squamous cell carcinoma 947 (0.1%) 1,577 (0.2%) 1,866 (0.5%) 2,298 (0.5%) 1,252 (0.6%) 
 Gallbladder 3,775 (0.6%) 4,421 (0.6%) 1,433 (0.4%) 5,770 (1.2%) 2,118 (1.0%) 
 Head/neck 18,148 (2.7%) 10,213 (1.5%) 12,999 (3.7%) 38,029 (8.2%) 12,262 (5.6%) 
 Kidney, all 49,596 (7.4%) 7,542 (1.1%) 11,631 (3.3%) 14,017 (3.0%) 4,381 (2.0%) 
 Kidney, renal parenchyma 49,878 (7.4%) 7,647 (1.1%) 11,815 (3.4%) 14,408 (3.1%) 4,695 (2.2%) 
 Kidney, renal pelvis 1,114 (0.2%) 385 (0.1%) 1,184 (0.3%) 1,380 (0.3%) 246 (0.1%) 
 Liver/intrahepatic bile duct 16,929 (2.5%) 8,778 (1.3%) 11,196 (3.2%) 9,743 (2.1%) 10,717 (4.9%) 
 Lung, all 57,588 (8.6%) 13,152 (1.9%) 74,726 (21.4%) 151,123 (32.4%) 27,187 (12.5%) 
 Lung, adenocarcinoma 30,175 (4.5%) 5,147 (0.7%) 26,625 (7.6%) 61,133 (13.1%) 5,212 (2.4%) 
 Lung, squamous cell carcinoma 15,821 (2.4%) 4,752 (0.7%) 19,985 (5.7%) 21,261 (4.6%) 3,678 (1.7%) 
 Lung, small cell carcinoma 1,663 (0.2%) 796 (0.1%) 11,765 (3.4%) 27,363 (5.9%) 2,206 (1.0%) 
 Lung, neuroendocrine excluding small cell 1,413 (0.2%) 324 (0.0%) 1,374 (0.4%) 4,190 (0.9%) 3,954 (1.8%) 
 Lung, other 8,516 (1.3%) 2,133 (0.3%) 14,977 (4.3%) 37,176 (8.0%) 12,137 (5.6%) 
 Lymphoma, all 30,161 (4.5%) 17,506 (2.5%) 19,194 (5.5%) 38,317 (8.2%) 11,151 (5.1%) 
 Lymphoma, Hodgkin 1,576 (0.2%) 2,621 (0.4%) 1,996 (0.6%) 1,965 (0.4%) 472 (0.2%) 
 Lymphoma, non-Hodgkin 29,277 (4.4%) 15,375 (2.2%) 18,073 (5.2%) 37,500 (8.1%) 8,611 (4.0%) 
 Melanoma 82,183 (12.2%) 14,021 (2.0%) 8,138 (2.3%) 4,442 (1.0%) 12,583 (5.8%) 
 Ovary 8,349 (1.2%) 3,425 (0.5%) 14,709 (4.2%) 11,031 (2.4%) 7,427 (3.4%) 
 Pancreas 4,513 (0.7%) 16,598 (2.4%) 5,773 (1.7%) 35,407 (7.6%) 8,985 (4.1%) 
 Prostate 1,408 (0.2%) 356,460 (51.8%) 34,557 (9.9%) 31,598 (6.8%) 33,105 (15.2%) 
 Sarcoma 4,827 (0.7%) 2,360 (0.3%) 2,983 (0.9%) 2,882 (0.6%) 17,546 (8.1%) 
 Stomach 8,919 (1.3%) 4,415 (0.6%) 3,987 (1.1%) 15,814 (3.4%) 8,958 (4.1%) 
 Thyroid 39,084 (5.8%) 6,882 (1.0%) 11,217 (3.2%) 6,989 (1.5%) 3,892 (1.8%) 
 Urothelial tract 2,069 (0.3%) 914 (0.1%) 1,946 (0.6%) 2,363 (0.5%) 545 (0.3%) 
 Uterus 59,541 (8.9%) 6,098 (0.9%) 11,736 (3.4%) 6,060 (1.3%) 6,866 (3.2%) 

Abbreviation: HR, hormone receptor.

Table 2 provides the fitted long-term cure fraction for each cancer type by stage. Adding a small residual excess risk of death (≤4%) for long-term survivors considered to be cured noticeably improved the likelihood of the model fit, consistent with prior studies of long-term cancer recurrence and relapse. Cure fractions for nine illustrative cancer types are shown in Fig. 1, and those for all 21 cancer types, as well as major subtypes, are provided in Supplementary Fig. S1. These results show that cure fractions vary among cancer types and stages at diagnosis, with most cancers following one of two stage-specific patterns. The first pattern, exemplified by colorectal cancer, shows high cure fraction at all stages before metastasis, followed by a precipitous decline from stage III (58% cure; 95% CI, 57%–59%) to stage IV (7% cure; 95% CI, 7%–7%). The second pattern, illustrated by gallbladder cancer, shows a decrease in cure fraction at each stage, from 50% at stage I (95% CI, 46%–54%) to 24% (95% CI, 22%–27%) at stage II, 22% (95% CI, 19%–25%) at stage III, and 2% (95% CI, 2%–3%) at stage IV. For most cancers, cure fraction was appreciably higher at stages I and II than at stages III and IV (usually by >50% for stage I versus IV).
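The two patterns can be made concrete by computing the drop in cure fraction between consecutive stages for the two example cancers. The Python sketch below uses the point estimates quoted above; the function name is hypothetical, and the calculation is a descriptive illustration rather than part of the authors' analysis.

```python
# Point estimates of cure fraction (%) at stages I, II, III, IV, from the text.
cure_fraction = {
    "Colon/rectum": [62, 61, 58, 7],
    "Gallbladder": [50, 24, 22, 2],
}


def stage_drops(values):
    """Percentage-point drop between each pair of consecutive stages."""
    return [earlier - later for earlier, later in zip(values, values[1:])]


colorectal_drops = stage_drops(cure_fraction["Colon/rectum"])   # [1, 3, 51]
gallbladder_drops = stage_drops(cure_fraction["Gallbladder"])   # [26, 2, 20]

# Share of the total stage I to IV decline that occurs between stages III and IV.
colorectal_late_share = colorectal_drops[-1] / sum(colorectal_drops)
gallbladder_late_share = gallbladder_drops[-1] / sum(gallbladder_drops)
```

For colorectal cancer nearly all of the stage I to IV decline (51 of 55 percentage points) occurs between stages III and IV, whereas for gallbladder cancer less than half of it does (20 of 48 percentage points), with a drop of similar size already present between stages I and II.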

Table 2.

Cure fraction (with 95% confidence interval) and residual excess risk by cancer type and stage at diagnosis among cases aged 40 to 79 years at diagnosis from 2006 to 2015, followed for mortality through 2020, SEER 17 registries.

Columns, left to right: cancer type; cure fraction, % (95% confidence interval) at stages I, II, III, and IV; residual excess risk (%) at stages I, II, III, and IV.
Anus 93 (87–95) 85 (82–87) 70 (68–72) 27 (23–31) 0.993 0.991 0.990 0.982 
Bladder 92 (90–94) 68 (66–69) 54 (52–57) 19 (17–20) 0.985 0.980 0.977 0.972 
Breast, all 88 (83–92) 82 (81–83) 59 (57–61) 6 (5–8) 1.000 1.000 0.999 0.998 
Breast, HR-positive (female) 87 (81–91) 82 (79–84) 60 (58–63) 5 (3–7) 1.000 0.999 0.998 0.997 
Breast, HR-negative (female) 96 (95–96) 85 (85–86) 60 (59–62) 23 (21–25) 0.997 0.995 0.991 0.962 
Breast, HR-unknown (female) 95 (81–100) 85 (77–89) 56 (47–66) 1 (0–4) 0.997 0.995 0.991 0.978 
Cervix, all 94 (93–95) 76 (73–78) 57 (55–59) 23 (21–25) 0.994 0.987 0.982 0.964 
Cervix, adenocarcinoma 94 (87–97) 62 (56–69) 52 (45–59) 21 (17–25) 0.995 0.991 0.983 0.977 
Cervix, squamous cell 94 (92–95) 79 (76–81) 59 (56–61) 24 (22–27) 0.994 0.988 0.984 0.963 
Colon/rectum 62 (59–66) 61 (58–65) 58 (57–59) 7 (7–7) 1.000 1.000 1.000 1.000 
Esophagus, all 60 (56–63) 43 (41–45) 27 (26–29) 5 (5–6) 0.984 0.967 0.963 0.962 
Esophagus, adenocarcinoma 69 (63–74) 43 (41–46) 26 (24–29) 5 (5–6) 0.985 0.970 0.966 0.963 
Esophagus, squamous cell 41 (37–44) 40 (36–43) 28 (25–31) 7 (6–8) 0.970 0.967 0.963 0.962 
Gallbladder 50 (46–54) 24 (22–27) 22 (19–25) 2 (2–3) 0.994 0.984 0.978 0.965 
Head/neck 95 (94–96) 89 (88–90) 79 (78–80) 64 (64–65) 0.988 0.978 0.976 0.974 
Kidney, all 100 (100–100) 96 (93–98) 89 (87–91) 14 (13–15) 0.993 0.980 0.969 0.961 
Kidney, renal parenchyma 100 (100–100) 97 (94–99) 89 (86–90) 15 (14–16) 0.992 0.979 0.969 0.961 
Kidney, renal pelvis 80 (71–89) 76 (67–83) 65 (59–70) 16 (13–20) 0.993 0.991 0.987 0.980 
Liver/intrahepatic bile duct 32 (30–34) 32 (29–34) 8 (8–9) 3 (2–3) 0.992 0.991 0.964 0.962 
Lung, all 74 (71–76) 47 (45–48) 21 (20–21) 5 (5–5) 0.969 0.962 0.961 0.961 
Lung, adenocarcinoma 82 (75–85) 50 (46–53) 23 (22–24) 6 (6–7) 0.973 0.965 0.962 0.961 
Lung, squamous cell 67 (63–69) 46 (44–48) 21 (20–22) 6 (5–6) 0.963 0.962 0.961 0.961 
Lung, small cell 43 (38–46) 30 (26–35) 16 (15–17) 3 (2–3) 0.966 0.963 0.962 0.961 
Lung, neuroendocrine excl. small cell 79 (75–83) 55 (48–63) 27 (24–30) 6 (5–7) 0.983 0.975 0.965 0.962 
Lung, neuroendocrine incl. small cell 62 (59–65) 40 (36–44) 18 (17–19) 3 (3–3) 0.974 0.965 0.962 0.961 
Lung, other 51 (44–56) 36 (32–39) 18 (17–19) 4 (4–5) 0.974 0.971 0.961 0.961 
Lymphoma, all 91 (90–91) 86 (85–87) 81 (80–82) 74 (73–75) 0.990 0.989 0.983 0.980 
Lymphoma, Hodgkin 93 (92–95) 93 (91–94) 84 (82–86) 76 (73–78) 0.994 0.994 0.989 0.986 
Lymphoma, non-Hodgkin 91 (90–91) 85 (84–86) 81 (80–82) 74 (73–75) 0.990 0.988 0.982 0.979 
Melanoma 95 (95–96) 83 (80–85) 67 (65–68) 27 (25–29) 1.000 0.991 0.983 0.964 
Ovary 79 (75–82) 52 (46–57) 17 (16–19) 5 (4–6) 1.000 1.000 1.000 0.998 
Pancreas 43 (41–46) 16 (15–17) 4 (4–5) 3 (2–3) 0.977 0.962 0.961 0.961 
Prostate 78 (72–81) 76 (71–79) 71 (66–76) 35 (33–60) 1.000 1.000 0.999 0.997 
Sarcoma 87 (75–95) 75 (70–78) 56 (52–59) 14 (12–16) 0.994 0.991 0.977 0.966 
Stomach 68 (66–70) 44 (42–47) 29 (27–31) 5 (5–6) 0.989 0.977 0.973 0.965 
Thyroid 99 (93–100) 97 (90–100) 94 (86–98) 85 (84–86) 1.000 0.999 0.999 0.980 
Urothelial tract 81 (70–89) 72 (65–77) 58 (54–62) 17 (15–20) 0.992 0.988 0.985 0.978 
Uterus 96 (95–96) 84 (83–86) 68 (66–70) 19 (17–21) 0.998 0.996 0.991 0.980 

Residual excess risk is the small incremental increased risk of death in long-term cancer survivors compared with the general population. HR, hormone receptor.

Figure 1.

Cure fraction (with 95% confidence interval indicated by error bars) by cancer type and stage at diagnosis for nine illustrative cancer types (breast, colon/rectum, esophagus, gallbladder, liver/intrahepatic bile duct, lung, ovary, pancreas, and stomach) among cases aged 40 to 79 years at diagnosis from 2006 to 2015, followed for mortality through 2020, SEER 17 registries. LTS, long-term survival (i.e., “cure fraction”).

The mixture cure model allocates individuals to one of two populations: those not cured, who are at acute risk of cancer mortality; and those cured, who are at the asymptotic risk of cancer recurrence and mortality. In scenarios where few deaths occur, these populations are difficult to distinguish by observed events. In such instances, a relatively low cure fraction may not reflect the number of long-term survivors, as even the not-cured fraction experiences a low cancer mortality rate.

Figure 2 illustrates the correlation between the difference in 5-year cause-specific survival between stage I, II, or III and stage IV for each cancer type, and the difference in cure fraction for the same between-stage comparisons by type. The close correlation between these two metrics (ρ = 0.968) indicates that 5-year survival difference by stage is a strong predictor of stage-specific long-term cure fraction for each cancer type. Thus, for example, if 5-year cause-specific survival for a given cancer type is 60% at stage I and 20% at stage IV, then there will be approximately 40% more patients considered cured at stage I than stage IV. Continuing with this example, under the common scenario of essentially no long-term survivors at stage IV (i.e., cure fraction = 0%), then stage I would be predicted to yield approximately 40% patients cured.
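The arithmetic behind this comparison can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis code: the cure fraction values are point estimates copied from the table above, while the function names and the one-to-one mapping from survival gap to cure fraction gap are assumptions that simply mirror the worked example in the text.

```python
# Cure fraction point estimates (%) for stages I, II, III, IV,
# copied from the table above for four illustrative cancer types.
CURE_FRACTION = {
    "Colon/rectum": (62, 61, 58, 7),
    "Gallbladder": (50, 24, 22, 2),
    "Lung, all": (74, 47, 21, 5),
    "Pancreas": (43, 16, 4, 3),
}


def gap_from_stage_iv(by_stage):
    """Percentage-point gaps in cure fraction: stages I, II, III minus stage IV."""
    *earlier_stages, stage_iv = by_stage
    return tuple(value - stage_iv for value in earlier_stages)


def predicted_early_cure(surv5_early, surv5_stage_iv, cure_stage_iv=0.0):
    """Hypothetical rule of thumb from the worked example in the text:
    the cure fraction gap is taken to equal the 5-year survival gap."""
    return cure_stage_iv + (surv5_early - surv5_stage_iv)


gaps = {name: gap_from_stage_iv(values) for name, values in CURE_FRACTION.items()}

for name, gap in gaps.items():
    print(name, gap)

# Worked example from the text: 60% vs. 20% 5-year survival, no cure at stage IV.
print(predicted_early_cure(60, 20))
```

A high correlation does not by itself guarantee a slope of exactly one, so `predicted_early_cure` should be read as a rough approximation rather than a fitted relationship.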

Figure 2.

Correlation (ρ = 0.968) between stage-specific differences in 5-year cancer-specific survival from stage IV (i.e., stage IV vs. stages I, II, and III) and stage-specific differences in cure fraction from stage IV by cancer type among cases aged 40 to 79 years at diagnosis from 2006 to 2015, followed for mortality through 2020, SEER 17 registries. Differences between stage IV and stage I are denoted by circles; differences between stage IV and stage II are denoted by triangles; and differences between stage IV and stage III are denoted by squares.

Further explication of the mixture cure model is provided using gallbladder cancer as an example. In Fig. 3A, the inset panels show the two survival curves for the two populations of patients with stage I cancer, namely, the 50% of individuals in the uncured category who follow a Weibull survival curve, and the 50% of individuals in the cured category who follow a progressively declining long-term cancer-specific survival curve with a low hazard of cause-specific death. A fit of the cancer-specific survival data is accomplished by combining the two curves on each population fraction, yielding the median fitted survival curve (solid line in main panel of Fig. 3A), which closely matches the observed cancer-specific survival data obtained directly from SEER (dots in main panel of Fig. 3A, plotted every 6 months). Fig. 3B shows the fitted two-population model simultaneously for all four stages of gallbladder cancer as an illustrative example.
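The two-population structure described above can be written as a short Python sketch. Only the 50% cure fraction for stage I gallbladder cancer comes from the text; the Weibull shape and scale, the constant low hazard for the cured group, and all function names are hypothetical values chosen for illustration, not the fitted parameters from this study.

```python
import math


def uncured_survival(t, shape, scale):
    """Weibull survival for the not-cured population (t in years)."""
    return math.exp(-((t / scale) ** shape))


def cured_survival(t, hazard):
    """Slowly declining survival for the cured population,
    modeled here with a small constant hazard (an assumption)."""
    return math.exp(-hazard * t)


def mixture_cure_survival(t, cure_fraction, shape, scale, cured_hazard):
    """Population survival: each curve weighted by its population fraction."""
    return (cure_fraction * cured_survival(t, cured_hazard)
            + (1.0 - cure_fraction) * uncured_survival(t, shape, scale))


# Cure fraction from the text; the remaining parameters are hypothetical.
PARAMS = dict(cure_fraction=0.50, shape=1.2, scale=1.5, cured_hazard=0.006)

for years in (0, 1, 2, 5, 10, 14):
    print(years, round(mixture_cure_survival(years, **PARAMS), 3))
```

With these illustrative parameters, nearly all of the not-cured population has died within a few years, so the combined curve flattens toward the cured population's curve scaled by the cure fraction.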

Figure 3.

Illustration of mixture cure model applied to gallbladder cancer as an example. A, Survival for stage I gallbladder cancer by years since diagnosis. B, Simultaneous fits of the mixture cure model across all four stages for gallbladder cancer. Dots are observed cancer-specific survival estimates, plotted every 3 months (A) or every 6 months (B). Solid blue line represents median fitted survival aggregating the entire population. Dashed blue line represents long-term survivors in the “cured” population. Insets in panel A show modeled cancer survival for the “not-cured” and “cured” populations separately. Blue shadow in panel B indicates the 95% confidence interval for fitted survival, obtained from the sampled parameters of the Markov chain Monte Carlo fit.

In secondary analyses, we found that older age at diagnosis had a limited detrimental impact on cure fraction for many cancers, although some exceptions, such as liver/intrahepatic bile duct cancer, exhibited a striking decrease in long-term survival with advancing age (Supplementary Fig. S2; Supplementary Table S1). We also found that the fraction of long-term survivors was greater in the subset of patients who underwent surgery with curative intent (Supplementary Fig. S3; Supplementary Table S2).

Using the concept of statistical cure as a real-world measure of the population burden of cancer, we illustrate that a considerable long-term cure fraction was evident at early stages for each of 21 stageable cancer types and additional subtypes examined in this study, and that these cure fractions were substantially reduced for metastatic cancers. While prior studies of cure fraction have characterized cancer types by survivorship trajectory, for example, as “relatively high survival and cure fraction, relatively short time to cure” or “very low survival and cure fraction, uncertain time to cure” (20), such patterns do not incorporate the influence of stage on cure fraction. Based on our stage-specific cure fraction estimates, even for cancers with poor survival for all stages combined, at least one third of patients (e.g., for pancreatic and liver/intrahepatic bile duct cancers) or more than half (e.g., for esophageal and lung cancers) achieved long-term survival at stage I. Conversely, for some cancers with good survival for all stages combined, fewer than one quarter of stage IV cases were considered statistically cured (e.g., for melanoma and bladder cancer).

We showed that the discrepancies in cure fraction between stage IV and stages I to II were especially pronounced for malignancies with currently recommended and broadly adopted screening strategies in the United States (i.e., breast and colorectal; refs. 29–31). In contrast, we estimated relatively high cure fractions even at stage IV for malignancies for which development of effective screening methods has been problematic (e.g., thyroid and prostate). Several other cancer types that currently lack recommended screening strategies, and would require new screening interventions, also exhibited large gaps between early- and late-stage cure fraction, illustrating the potential for general population screening to enable early detection and cure across multiple cancer types. Such unscreened cancer types collectively account for approximately 70% of cancer deaths annually (7).

No precursor lesions have been convincingly identified for the majority of human malignancies, with few exceptions such as cervix and large bowel. Hence, novel screening tools must currently aim for early detection rather than preventive screening (32). Any screening test needs to accommodate two fundamental prerequisites. First, cancer mortality reduction can be achieved only if the dwell time encompasses a period during which a substantial proportion of cancers progress to a less curable stage. Second, the survival benefit should outweigh harms due to false-positive tests, complications during diagnostic work-up, and overdiagnosis of nonlethal disease. In general, population-based screening programs with broad eligibility criteria, as opposed to personalized screening, will realize the greatest impact on the overall cancer burden. Given that age is by far the strongest determinant of absolute overall cancer risk (33), age-based eligibility criteria can help to balance the benefits and risks of cancer screening.

We also showed that differences by stage in 5-year cancer-specific survival are highly predictive of differences by stage in long-term cure fraction across cancer types. Therefore, although 5-year cancer-specific survival is relatively short-term, it can serve as a close proxy of long-term survival. The observed stage-related differences in long-term cure fraction suggest that parallel differences between early- and late-stage cancer-specific survival curves are not explained entirely by lead-time bias (i.e., apparently longer survival, without delayed mortality, due to earlier diagnosis). Instead, persistent differences in long-term mortality outcomes—the gold standard for measuring the impact of cancer control strategies—can be presaged by contrasts in shorter-term survival.
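The distinction between lead-time bias and a genuine difference in cure fraction can be illustrated with a toy calculation. This is a sketch under stated assumptions, not part of the study's analysis: it uses a simplified mixture cure model in which cured patients never die of cancer, and the cure fraction, Weibull parameters, and 2-year lead time are all hypothetical. Moving the date of diagnosis earlier without changing the date of death raises 5-year survival but leaves the long-term plateau, and therefore the cure fraction, unchanged.

```python
import math


def mixture_survival(t, cure_fraction, shape, scale):
    """Simplified mixture cure model: cured patients never die of cancer,
    uncured patients follow a Weibull survival curve (t in years)."""
    uncured = math.exp(-((t / scale) ** shape))
    return cure_fraction + (1.0 - cure_fraction) * uncured


def with_lead_time(t, lead, **params):
    """Survival measured from a diagnosis made `lead` years earlier,
    with every death occurring at the same calendar time as before."""
    if t <= lead:
        return 1.0
    return mixture_survival(t - lead, **params)


# All parameter values below are hypothetical.
params = dict(cure_fraction=0.20, shape=1.0, scale=3.0)

five_year_late = mixture_survival(5.0, **params)
five_year_early = with_lead_time(5.0, lead=2.0, **params)
plateau_late = mixture_survival(50.0, **params)
plateau_early = with_lead_time(50.0, lead=2.0, **params)

print("5-year survival, no lead time:", round(five_year_late, 3))
print("5-year survival, 2-year lead time:", round(five_year_early, 3))
print("long-term plateau, no lead time:", round(plateau_late, 3))
print("long-term plateau, 2-year lead time:", round(plateau_early, 3))
```

In this toy setting, a pure lead-time effect would widen short-term survival gaps while leaving cure fraction gaps at zero, which is the opposite of the pattern of persistent stage-related differences reported here.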

Several statistical methods exist for estimating cure fraction (e.g., 10, 24, 25, 34–36). The mixture cure model applied in this study, one of the most commonly used, is supported by long-term follow-up data: most deaths from cancer occur shortly after diagnosis, with a generally decreasing probability of death from cancer thereafter, although mortality dynamics differ by age and cancer type (26). Flexible parametric cure models that do not assume a Weibull distribution for the uncured fraction may better accommodate older age groups, for instance, that have a high excess hazard within a few months after diagnosis (34); however, we excluded the oldest adults from our study because of their poorer-quality data (37), and because they typically are not considered eligible for general population cancer screening.

Strengths of this study include the use of high-quality, population-based, broadly generalizable SEER data; uniform classification of all cases according to the AJCC 6th edition staging criteria; and long-term follow-up of a large cohort for up to 14+ years after diagnosis. Our use of modern statistical methods to fit constrained functions allowed estimation of uncertainty while producing interpretable and statistically stable results, especially in scenarios where few patients died from early-stage cancer. In particular, we recovered and quantified the difference between the acute phase of mortality (when the hazard is changing rapidly) and the late phase, when it is close to constant (and possibly zero). Our model constraints affected the range of the confidence intervals, but generally did not appreciably affect the point estimates of cure fraction.

Our study has some limitations. Cohort or period effects, such as upward stage migration over time, may have influenced this analysis despite the use of a uniform staging system during our chosen years. Secular trends in cancer risk factors (e.g., smoking) may also affect the distribution of cancer subtypes, which can vary in aggressiveness; we accounted for some subtypes of lung and breast cancers, but not other cancer types. Further, for some cancer types, changes in therapy over this time period may render earlier data obsolete. Our study is limited in part by the reliance on death-certificate-based classification of cancer-specific cause of death, which is susceptible to error. Bias may, however, have been mitigated by making comparisons mainly across stages within the same cancer type. Our analysis is also limited by the lack of detailed SEER data on treatment, which could have enabled evaluation of potential long-term effects of different therapeutic regimens by cancer type and stage; and the absence of data on morbidity and quality-of-life endpoints, such as hospitalizations, patient-reported outcomes, employment status, and costs. Early-stage cure fraction estimates may be inflated for certain cancer types (e.g., thyroid, melanoma, uterus, breast, prostate, and kidney) that are prone to overdiagnosis, that is, detection of cancer that would otherwise remain clinically silent. Some of these malignancies can be managed by active surveillance or less aggressive therapeutic regimens. In this analysis, we did not evaluate potential heterogeneity in cure fraction by race/ethnicity or other sociodemographic characteristics, but these factors may be explored in future studies. Finally, additional years of follow-up in the future will enable more robust estimates, especially within subgroups.

In summary, this study uses a mixture cure model for all major cancer types and stages to estimate the maximal potential for early-stage cancer detection to minimize long-term cancer-specific mortality for many clinically diverse cancer types, most of which have no currently recommended screening modalities. By incorporating stage data, our findings augment prior cure fraction studies, which have generally discussed the implications of their results in terms of survivorship care, treatment evaluation, and allocation of healthcare resources. Instead, by calculating early- versus late-stage differences in cure fraction across the spectrum of cancer types, we focus on the theoretical public health impact of cancer screening to substantially increase the proportion of patients cured of cancer through early detection and effective treatment.

E. Hubbell reports personal fees from GRAIL, LLC during the conduct of the study; other support from GRAIL, LLC and Illumina, Inc. outside the submitted work; also has a patent for cancer detection and analysis (multiple patents) pending, a patent for sequencing (multiple patents) issued, and a patent for microarrays (multiple patents) issued. C.A. Clarke reports personal fees from GRAIL, LLC during the conduct of the study; personal fees from GRAIL, LLC outside the submitted work. E.T. Chang reports personal fees from GRAIL, LLC and other support from GRAIL, LLC during the conduct of the study. No disclosures were reported by the other authors.

E. Hubbell: Conceptualization, software, formal analysis, visualization, methodology, writing–review and editing. C.A. Clarke: Conceptualization, supervision, writing–review and editing. K.E. Smedby: Writing–review and editing. H.-O. Adami: Writing–review and editing. E.T. Chang: Conceptualization, data curation, supervision, visualization, writing–original draft.

This work was funded by GRAIL, LLC.

The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).

1. Guzzinati S, Virdone S, De Angelis R, Panato C, Buzzoni C, Capocaccia R, et al. Characteristics of people living in Italy after a cancer diagnosis in 2010 and projections to 2020. BMC Cancer 2018;18:1–13.
2. Hovaldt HB, Suppli NP, Olsen MH, Steding-Jessen M, Hansen DG, Møller H, et al. Who are the cancer survivors? A nationwide study in Denmark, 1943–2010. Br J Cancer 2015;112:1549–53.
3. Maddams J, Utley M, Møller H. Projections of cancer prevalence in the United Kingdom, 2010–2040. Br J Cancer 2012;107:1195–202.
4. Miller KD, Nogueira L, Devasia T, Mariotto AB, Yabroff KR, Jemal A, et al. Cancer treatment and survivorship statistics, 2022. CA Cancer J Clin 2022;72:409–36.
5. Radkiewicz C, Järkvik Krönmark J, Adami H-O, Edgren G. Declining cancer incidence in the elderly: decreasing diagnostic intensity or biology? Cancer Epidemiol Biomarkers Prev 2022;31:280–6.
6. Rahib L, Wehner MR, Matrisian LM, Nead KT. Estimated projection of US cancer incidence and death to 2040. JAMA Netw Open 2021;4:e214708.
7. Siegel RL, Miller KD, Wagle NS, Jemal A. Cancer statistics, 2023. CA Cancer J Clin 2023;73:17–48.
8. ReFaey K, Tripathi S, Grewal SS, Bhargav AG, Quinones DJ, Chaichana KL, et al. Cancer mortality rates increasing vs cardiovascular disease mortality decreasing in the world: future implications. Mayo Clin Proc Innov Qual Outcomes 2021;5:645–53.
9. Dunn BK, Woloshin S, Xie H, Kramer BS. Cancer overdiagnosis: a challenge in the era of screening. J Natl Cancer Cent 2022;2:235–42.
10. De Angelis R, Capocaccia R, Hakulinen T, Soderman B, Verdecchia A. Mixture models for cancer survival analysis: application to population-based data with covariates. Stat Med 1999;18:441–54.
11. Hubbard MO, Fu P, Margevicius S, Dowlati A, Linden PA. Five-year survival does not equal cure in non–small cell lung cancer: A Surveillance, Epidemiology, and End Results–based analysis of variables affecting 10- to 18-year survival. J Thorac Cardiovasc Surg 2012;143:1307–13.
12. Dood RL, Zhao Y, Armbruster SD, Coleman RL, Tworoger S, Sood AK, et al. Defining survivorship trajectories across patients with solid tumors: an evidence-based approach. JAMA Oncol 2018;4:1519–26.
13. Rueda OM, Sammut S-J, Seoane JA, Chin S-F, Caswell-Jin JL, Callari M, et al. Dynamics of breast cancer relapse reveal late recurring ER-positive genomic subgroups. Nature 2019;567:399–404.
14. Dal Maso L, Panato C, Tavilla A, Guzzinati S, Serraino D, Mallone S, et al. Cancer cure for 32 cancer types: results from the EUROCARE-5 study. Int J Epidemiol 2020;49:1517–25.
15. Cvancarova M, Aagnes B, Fosså SD, Lambert PC, Møller B, Bray F. Proportion cured models applied to 23 cancer sites in Norway. Int J Cancer 2013;132:1700–10.
16. Dal Maso L, Guzzinati S, Buzzoni C, Capocaccia R, Serraino D, Caldarella A, et al. Long-term survival, prevalence, and cure of cancer: a population-based estimation for 818 902 Italian patients and 26 cancer types. Ann Oncol 2014;25:2251–60.
17. Dal Maso L, Panato C, Guzzinati S, Serraino D, Francisci S, Botta L, et al. Prognosis and cure of long-term cancer survivors: a population-based estimation. Cancer Med 2019;8:4497–507.
18. Kou K, Dasgupta P, Cramb SM, Yu XQ, Baade PD. Temporal trends in population-level cure of cancer: the Australian context. Cancer Epidemiol Biomarkers Prev 2020;29:625–35.
19. Romain G, Boussari O, Bossard N, Remontet L, Bouvier A-M, Mounier M, et al. Time-to-cure and cure proportion in solid cancers in France. A population based study. Cancer Epidemiol 2019;60:93–101.
20. Tralongo P, Surbone A, Serraino D, Dal Maso L. Major patterns of cancer cure: clinical implications. Eur J Cancer Care (Engl) 2019;28:e13139.
21. Schwartzberg L, Broder MS, Ailawadhi S, Beltran H, Blakely LJ, Budd GT, et al. Impact of early detection on cancer curability: a modified Delphi panel study. PLoS One 2022;17:e0279227.
22. Greene FL, Page DL, Fleming ID, Fritz AG, Balch CM, Haller DG, et al. AJCC Cancer Staging Manual. Springer Science & Business Media; 2013.
23. SEER. Cause-specific Death Classification - SEER Recodes. SEER. 2020 [cited November 21, 2023]. Available from: https://seer.cancer.gov/causespecific/index.html.
24. Phillips N, Coldman A, McBride ML. Estimating cancer prevalence using mixture models for cancer survival. Stat Med 2002;21:1257–70.
25. Yu B, Tiwari RC, Cronin KA, Feuer EJ. Cure fraction estimation from the mixture cure models for grouped survival data. Stat Med 2004;23:1733–47.
26. Colonna M, Grosclaude P, Bouvier AM, Goungounga J, Jooste V. Health status of prevalent cancer cases as measured by mortality dynamics (cancer vs. noncancer): Application to five major cancers sites. Cancer 2022;128:3663–73.
27. Stan Development Team. RStan: the R interface to Stan. R package version 2.21.5. [cited November 21, 2023]. Available from: https://mc-stan.org/.
28. Wickham H. feather: R Bindings to the Feather “API”. R package version 0.3.5. https://CRAN.R-project.org/package=feather. 2019.
29. Sabatino SA, Thompson TD, White MC, Shapiro JA, de Moor J, Doria-Rose VP, et al. Cancer Screening Test Receipt — United States, 2018. MMWR Morb Mortal Wkly Rep. 2021 [cited November 21, 2023]. Available from: https://www.cdc.gov/mmwr/volumes/70/wr/mm7002a1.htm.
30. Fedewa SA, Kazerooni EA, Studts JL, Smith RA, Bandi P, Sauer AG, et al. State variation in low-dose computed tomography scanning for lung cancer screening in the United States. JNCI J Natl Cancer Inst 2021;113:1044–52.
31. US Preventive Services Task Force. A and B Recommendations. [cited November 21, 2023]. Available from: https://www.uspreventiveservicestaskforce.org/uspstf/recommendation-topics/uspstf-a-and-b-recommendations.
32. Kalager M, Adami H-O, Dickman PW, Lagergren P, Steindorf K. Cancer outcome research: a European challenge part II: opportunities and priorities. Mol Oncol 2022;16:2300–11.
33. Patel AV, Deubler E, Teras LR, Colditz GA, Lichtman CJ, Cance WG, et al. Key risk factors for the relative and absolute 5-year risk of cancer to enhance cancer screening and prevention. Cancer 2022;128:3502–15.
34. Andersson TM, Dickman PW, Eloranta S, Lambert PC. Estimating and modelling cure in population-based cancer studies within the framework of flexible parametric survival models. BMC Med Res Methodol 2011;11:96.
35. Jakobsen LH, Andersson TM-L, Biccler JL, Poulsen, Severinsen MT, El-Galaly TC, et al. On estimating the time to statistical cure. BMC Med Res Methodol 2020;20:71.
36. Lambert PC, Thompson JR, Weston CL, Dickman PW. Estimating and modeling the cure fraction in population-based cancer survival analysis. Biostatistics 2007;8:576–94.
37. Jdanov DA, Jasilionis D, Soroko EL, Rau R, Vaupel JW. Beyond the Kannisto-Thatcher Database on Old Age Mortality: an assessment of data quality at advanced ages. MPIDR Work Pap. Max Planck Institute for Demographic Research, Rostock, Germany; 2008. [cited November 21, 2023]. Available from: https://ideas.repec.org/p/dem/wpaper/wp-2008-013.html.
This open access article is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.
