Abstract
Spatial modeling of cancer survival is an important tool for identifying geographic disparities and providing an evidence base for resource allocation. Many different approaches have been used to understand how survival varies geographically. This is the first scoping review to describe the methods and visualization techniques used and to assess temporal trends in publications. The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines and searched the PubMed and Web of Science databases. Two authors independently screened articles. Articles were eligible for review if they measured cancer survival outcomes in small geographical areas by using spatial regression and/or mapping. Thirty-two articles were included, and the number of publications increased over time. Most studies were conducted in high-income countries using cancer registry databases. Eight different methods of modeling spatial survival were identified, and there were seven different ways of visualizing the results. Increasing the use of spatial modeling through enhanced data availability and knowledge sharing could help inform and motivate efforts to improve cancer outcomes and reduce excess deaths due to geographical inequalities. Efforts to improve the coverage and completeness of population-based cancer registries should continue to be a priority, in addition to encouraging the open sharing of relevant statistical programming syntax and international collaborations.
Introduction
Cancer is one of the leading causes of morbidity and death worldwide (1) with an estimated 19.3 million new cases diagnosed in 2020 and 10 million deaths (2), thus contributing to a very high economic burden (3). Over 80% of the countries within the World Health Organization (WHO) have publicly available cancer prevention and control strategies, including improving accessibility to early identification and treatment. However, there are continued challenges regarding evidence-based prioritization of areas for resource allocation (4).
Spatial epidemiology describes the distribution of disease indicators by place and may facilitate ecological analyses between these indicators and sociodemographic and other area-level characteristics. It has emerged as an increasingly important tool for informing and assisting resource-sensitive cancer prevention efforts (5, 6) and has been recommended by the WHO to support evidence-based decisions (5, 7).
The development of various sophisticated spatial analysis methods and technologies has provided greater opportunities for small-area disease mapping (6, 8, 9). For example, Lyseen and colleagues highlighted that the number of published articles on spatial patterns of disease incidence or prevalence rates increased between 2000 and 2012 (10). Another review specifically related to spatial analyses of cancer incidence or mortality found an increasing trend of articles published between 1979 and 2015 (11).
Spatial survival analysis uses statistical models to assess the association between time to event data (survival) and space (small geographical units). To date, most cancer atlases have focused on incidence and/or mortality patterns (10, 11). However, survival after cancer diagnosis is an important indicator of the accessibility and effectiveness of diagnostic, treatment, and support services. Hence, understanding small-area variations in cancer survival can provide unique insights into utilization disparities in diagnostic and treatment services.
While previous reviews (10, 11) have included some spatial papers with survival outcomes, to our knowledge, there has been no comprehensive review of the published peer-reviewed literature in this field. Here, we aimed to address this gap by conducting a scoping review of published articles on the spatial analysis of cancer survival. Given the typically more complex statistical methodology required for survival analyses and the unique considerations when reporting survival estimates, we also provide an overview of the methods and visualization techniques used in these studies to help guide other researchers in this rapidly emerging field.
Materials and Methods
Protocol
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were used to conduct this review (12).
Review question
The review question was structured using the Population, Exposure, Comparator and Outcome framework (13) to assess published articles on patterns in survival (Outcome) following a cancer diagnosis (Population) in small geographical areas (Exposure) and to summarize the methods and visualization techniques.
While there was no specific comparator group in this study, spatial analyses inherently compare discrepancies by geographic unit.
Terminology
In this review, spatial survival analysis methods are considered to include mapping, clustering, or regression modeling of small-area data. The scope of visualization techniques in the included articles focused on survival outcome measures, including rates or regression coefficients, which were visualized by small geographical units. The term “articles” refers to published peer-reviewed studies.
Ethics and participants
No ethics approval was required since the data were extracted from previously published articles.
Article searches
A systematic search for all indexed articles was carried out on September 21, 2022 using PubMed and Web of Science electronic bibliographic databases. No filter was applied to the year of publication. Selected Medical Subject Headings and keywords were used as follows: “cancer OR neoplasms” AND “proportional hazards model OR survival analysis OR survival OR excess mortality OR excess death OR mortality/epidemiology OR hazard ratio” AND “geographic mapping OR geographic OR spatial analysis OR spati* OR spatiotemporal analysis OR spatial pattern OR disease map.” The details of the search and number of articles accessed during the search are summarized in Supplementary Materials and Methods (Supplementary Table S1).
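The Boolean structure of this search can be made explicit with a short sketch. The term lists are copied from the strategy described above; the function name `build_query` and the plain-string output are illustrative assumptions and do not reproduce the database-specific field tags given in Supplementary Table S1.

```python
def build_query(*concept_blocks):
    """Join synonyms with OR inside each block, then join the blocks with AND."""
    return " AND ".join(
        "(" + " OR ".join(block) + ")" for block in concept_blocks
    )


# Term lists taken from the search strategy in the text.
population = ["cancer", "neoplasms"]
outcome = [
    "proportional hazards model", "survival analysis", "survival",
    "excess mortality", "excess death", "mortality/epidemiology",
    "hazard ratio",
]
exposure = [
    "geographic mapping", "geographic", "spatial analysis", "spati*",
    "spatiotemporal analysis", "spatial pattern", "disease map",
]

query = build_query(population, outcome, exposure)
print(query)
```

Keeping each concept as its own list makes it easier to see that an article must match at least one term from every block to be retrieved.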
Inclusion and exclusion criteria
Articles were included if they met the following four criteria: (i) the study population included individuals diagnosed with cancer, (ii) the outcome measure was survival after cancer diagnosis, (iii) survival outcomes were estimated across small geographical areas, and (iv) quantitative estimates of cancer survival were reported and/or visualized at the small area level.
The scope of the review was limited to peer-reviewed original quantitative articles in English. Reviews, books, reports, editorial letters, commentaries, conference abstracts, and mass-media publications were excluded. The titles of the articles included in the reference sections of reviewed articles identified during the searches were examined for any additional potentially relevant articles.
Study screening
After removing duplicate articles, all articles were initially screened using titles, and those deemed relevant were further screened on the basis of their abstracts. One author (H.M. Bizuyehu) screened full-text articles and those articles with abstracts and titles with insufficient information using the inclusion or exclusion criteria. Reasons for exclusion of articles by full-text review were noted. Another reviewer (P. Dasgupta) independently reviewed the full-text articles. Any differences during the screening process were discussed, and the senior author (P.D. Baade) was consulted when necessary. Automatic screening was not performed in this review.
Data extraction
The data extracted from each of the included articles were year of publication, journal name, total follow-up period (time from diagnosis to study end), geographical unit of analysis (type, number), software used for spatial analysis, statistical methods used for survival analysis, visualization options, country, data source, cancer type, and participants’ age and sex.
Data analysis
The extracted data from the included articles were summarized using numbers and proportions, with the total follow-up period summarized as means. A bar chart was used to present the location of articles by individual countries, WHO regions (Africa, Americas, South-East Asia, Europe, Eastern Mediterranean, and Western Pacific) and income categories (low, lower middle, upper middle, and high).
The keywords supplied by the original authors from each article were combined and visualized using the online WordClouds software (https://www.wordclouds.com/). In the software, the most frequently utilized terms are visualized in large and bold font, whereas the least frequently cited words are visualized in smaller and normal font.
Data availability
All data reported in this study were extracted from published studies. All relevant data from the published articles are provided within the article and Supplementary Data.
Results
Article selection
Of the 1,370 articles identified by the systematic database searches, 129 were duplicates (Fig. 1). The remaining articles (1,241) were screened first by title, then by abstract, and finally by full text. Thirty-two articles that reported and/or visualized survival results at small geographical units were included in the final review; these were selected from the 75 full-text articles reviewed. Approximately three-quarters (n = 31) of the 43 excluded articles provided survival results but did not report and/or visualize those estimates at small geographical units. The remaining articles (n = 12) were excluded because they did not report survival results.
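The counts in this selection flow can be checked with simple arithmetic. The sketch below uses only the numbers reported above; the variable names are illustrative.

```python
# Counts reported in the article-selection paragraph.
identified = 1370
duplicates = 129
full_text_reviewed = 75
included = 32
excluded_not_small_area = 31   # survival reported, but not at small-area level
excluded_no_survival = 12      # no survival results reported

screened = identified - duplicates
excluded_full_text = full_text_reviewed - included

# The two exclusion reasons should account for every excluded full-text article.
assert screened == 1241
assert excluded_full_text == 43
assert excluded_not_small_area + excluded_no_survival == excluded_full_text

share = excluded_not_small_area / excluded_full_text
print(f"Excluded for lacking small-area estimates: {share:.0%}")  # prints 72%
```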
Article characteristics
Nearly all (31/32) studies were carried out using administrative data, primarily from cancer registries (n = 28; Table 1), and the same number (n = 31) used unit record (individual-level) data rather than aggregated data sources. Of the cancer registry-based studies (n = 28), nearly all (n = 27) used population-based cancer registries, of which four had publicly available data [i.e., the Surveillance, Epidemiology, and End Results Program (14–17)], while one article used hospital-based cancer registry data.
Article | Follow-up period^a | Age group (years)^b | Sex | Country^c | Cancer cases (N) | Data source (aggregate or individual)^d | Software/package (spm, viz, bsv) | Geographical units (N)^e | Spatial survival model | Outcome measures visualized |
---|---|---|---|---|---|---|---|---|---|---|
Cameron et al., 2022 (33) | 2002–2016 | 15–89 | P | AUS*** | Mesothelioma (N = 7,167) | CRP (Individual) | WinBUGS (bsv) | Statistical area 2 (N = 2,148) | BSSM | Excess HR |
Cameron et al., 2022 (34) | 2002–2016 | 15–89 | P | AUS*** | MPNs (N = 9,580) | CRP (Individual) | WinBUGS (bsv) | Statistical area 2 (N = 2,148) | BSSM | Excess HR |
Fan et al., 2021 (19) | 2003–2016 | ≥18 | F | USA*** | Female Breast (N = 937,953) | CRH (Individual) | R (spm), ArcGIS (viz) | Counties (N = 671) | BSSM | 5-year relative survival rate |
Ghazali et al., 2021 (35) | 2008–2013 | Not reported | P | Malaysia*** | Colorectal (N = 4,412) | CRNS (Individual) | R (bsv), e.g., spatsurv (viz) | Administrative districts (N = 144) | Spatial survival | Exceedance risk of hazard |
Yee et al., 2021 (36) | 2005–2018 | All age | P | Canada** | Esophageal (N = 10,228) | Administrative data (Individual) | QGIS (viz) | Census tracts | Modeling not reported^f | Median survival rate |
Zhang et al., 2021 (37) | 2002–2012 | ≥30 | P | China* | Lung (N = 3,687) | CRP (Individual) | spatsurv in R (spm), ArcMap (viz) | Participant's latitude and longitude | Spatial survival | HR, HR < 1 |
Freeman et al., 2020 (38) | 2000–2012 | >18 | P | USA* | Colorectal (N = 27,447) | CRP (Individual) | Not applicable | Census tracts | Modeling not reported^f | Not reported |
Holowatyj et al., 2020 (17) | 1999–2018 | 15–49 | F | USA*** | Colorectal (N = 28,790) | CRP (Individual) | ArcGIS (viz) | Counties (N = 3,108) | Modeling not reported^f | Cause-specific survival rate |
Huse et al., 2020 (15) | 2000–2013 | Mean = 67.9 | P | USA** | PBCLOLBCa (N = 183,484) | CRP (Individual) | R2OpenBUGS (spm) | Parish boundaries | Accelerated failure time | Not reported |
Rogers et al., 2020 (16) | 1999–2017 | 15–49 | M | USA*** | Colorectal (N = 32,447) | CRP (Individual) | Not applicable | Counties (N = 3,108) | Modeling not reported^f | Not reported |
Villanueva et al., 2020 (39) | 1996–2016 | >18 | F | USA** | Ovarian (N = 29,844) | CRP (Individual) | MapGAM in R (bsv) | Participant's geocoded location | CPHAM | HR and stratified by stage |
Wang et al., 2020 (40) | 2004–2014 | ≥40 | M | USA** | Prostate (N = 94,274) | CRP (Individual) | R2WinBUGS in R (spm), ArcGIS (viz) | Counties (N = 28) | Accelerated failure time | Spatial frailties probability by quantiles |
Wiese et al., 2020 (41) | 2006–2016 | 21–85 | P | USA** | Colon (N = 3,949) | CRP (Individual) | R (bsv) | Census tracts | Bayesian geoadditive | Hazard rate, hazard rate difference |
Carroll et al., 2019 (14) | 1973–2013 | All age | P | USA** | Colorectal (N = 52,124) | CRP (Individual) | Not reported | Counties (N = 99) | Accelerated failure time | Not reported |
Wiese et al., 2019 (21) | 2010–2015 | ≥18 | F | USA** | Female breast (N = 27,078) | CRP (Individual) | R (bsv) | Census tracts | Bayesian geoadditive | Hazard rate |
Hesam et al., 2018 (42) | 2002–2017 | Mean = 62.6 | P | Iran* | Gastrointestinal (N = 602) | Administrative data (Individual) | OpenBUGS (bsv) | Ward (local government) | BSSM | Posterior median frailties |
Cramb et al., 2017 (43) | 1997–2013 | 15–89 | P | AUS** | PBCMLCa (N = 112,000) | CRP (Individual) | WinBUGS (spm), MapInfo (viz) | Statistical area 2 (N = 516) | BSTFPRSM | EHR, changes over time in EHR, Excess risk of death within 5 years |
Vieira et al., 2017 (44) | 1996–2007 | ≥18 | F | USA** | Ovarian (N = 11,765) | CRP (Individual) | MapGAM in R (bsv) | Participants’ latitude and longitude | CPHAM | HR, lower 95% CI higher 95% CI |
Beyer et al., 2016 (18) | 2002–2011 | Not reported | P | USA* | BCCa (N = 18,697) | CRP (Aggregate) | R (bsv) | Post code | Adaptive spatial filtering | 5-year survival rate |
Beyer et al., 2016 (45) | 2002–2011 | ≥18 | F | USA* | Female breast (N = 1,010) | CRP (Individual) | Not reported | Post code | Adaptive spatial filtering | Not reported |
Cramb et al., 2016 (46) | 2001–2011 | <90 | P | AUS** | BCCa (N = 67,860) | CRP (Individual) | WinBUGS (spm), MapInfo (viz) | Statistical local areas (N = 478) | BSTFPRSM | Excess mortality OR |
Freeman et al., 2016 (47) | 2003–2009 | ≥19 | P | USA** | Leukemia (N = 900) | CRP (Individual) | SAS (viz) | Census tracts | Modeling not reported^f | HR |
Bristow et al., 2015 (48) | 1996–2006 | ≥18 | F | USA** | Ovarian (N = 18,199) | CRP (Individual) | MapGAM in R (bsv) | Participants’ latitude and longitude | CPHAM | HR |
Chien et al., 2013 | 1991–2005 | ≥66 | P | USA* | Colorectal (N = 9,038) | CRP (Individual) | R (bsv), e.g., Fields package (viz) | Census tracts (N = 1,641) | Bayesian geosurvival | HR |
Lin et al., 2013 (49) | 1995–2008 | 50–71 | P | USA** | All cancers (N = 17,611) | Survey (Individual) | R or SAS (spm) | Census tracts | CPHAM | Not reported |
Cramb et al., 2012 (50) | 1996–2007 | <90 | P | AUS** | BCCa (N = 51,592) | CRP (Individual) | WinBUGS (bsv) | Statistical local areas (N = 478) | BSSM | Excess HR |
Wan et al., 2012 (51) | 1995–2010 | All age | P | USA** | Colorectal (N = 56,734) | CRP (Individual) | SaTScan (spm) | Census tracts | Spatial scan statistic^g | Not reported |
Russell et al., 2011 (20) | 1999–2003 | 40–85 | F | USA** | Female breast (N = 15,256) | CRP (Individual) | Not applicable | Metropolitan statistical areas (N = 15) | Modeling not reported^f | Not reported |
Henry et al., 2009 (52) | 1996–2006 | Mean = 69.8 | P | USA** | Colorectal (N = 25,040) | CRP (Individual) | SaTScan (spm) | Census tracts (N = 1,951) | Spatial scan statistic^g | Not reported |
Meliker et al., 2009 (53) | 1985–2002 | Not reported | P | USA** | PBCa (N = 244,833) | Surveillance (Individual) | Space-Time Intelligent System (viz) | Neighborhood boundaries (N = 212) | Modeling not reported^f | 5-year survival rate differences (both absolute and relative) |
Fairley et al., 2008 (54) | 1990–2004 | ≥15 | M | UK* | Prostate (N = 19,408) | CRP (Individual) | WinBUGS (bsv) | Primary care trust boundaries (N = 44) | BSSM | Spatially smoothed EHR |
Gregorio et al., 2007 (55) | 1984–1998 | ≤65 and ≥75 | M | USA** | Prostate (N = 27,189) | CRP (Individual) | SaTScan (spm), Maptitude (viz) | Participants’ latitude and longitude | Spatial scan statistic^g | Survival rate |
Note: spm, viz, and bsv indicate whether the software/package was used for spatial survival modeling (spm), visualization (viz), or both spatial modeling and visualization (bsv); "Not applicable" indicates that neither spatial modeling nor visualization was performed.
Abbreviations: AUS, Australia; BSSM, Bayesian spatial survival model; BSTFPRSM, Bayesian spatiotemporal flexible parametric relative survival model; CRH, cancer registry (hospital based); CRNS, cancer registry, not specified whether population or hospital based; CRP, cancer registry (population based); CPHAM, Cox proportional hazards additive model; F, females; BCCa, female breast and colorectal cancers; PBCa, prostate and female breast cancers; PBCLOLBCa, prostate, female breast, colorectal, leukemia, ovarian, lung and bronchus cancers; PBCMLCa, prostate, female breast, colorectal, melanoma and lung cancers; M, males; MPN, myeloproliferative neoplasms; P, persons; UK, United Kingdom; USA, United States of America.
^a Total follow-up period (time from diagnosis to end of the study).
^b Mean age was used if it was reported, and no other information was available regarding the included participants’ age.
^c The stars for each country represent the geographical coverage of the study: *** (national level study), ** [administrative units (states, provinces, regional administrative units) within the country], * (subadministrative units, i.e., some part of states, provinces, or regional administrative units).
^d (Aggregate) refers to aggregated data and (Individual) refers to individual unit record data, as accessed by the original authors of the article.
^e The number of geographical units is reported only if it was available in the article.
^f Survival outcome measures are reported at small geographical units, but the specific spatial survival analysis type is not reported.
^g Spatial scan statistic for survival data with exponential distribution.
More than three-quarters (n = 25) of the articles examined only one specific cancer type, and the rest included multiple cancer types, or all cancers combined. The most frequently assessed individual cancer types were colorectal (n = 9), female breast (n = 4), ovarian (n = 3), and prostate cancer (n = 3). In more than half (n = 17) of the articles, the study cohort ranged between 10,000 and 60,000 patients with cancer, with nine and six articles having fewer or greater than this range, respectively (Table 1).
Approximately two-thirds of the included articles (n = 20) used data for both sexes combined (persons), while the remainder used data only for males (n = 4) or females (n = 8; Table 1). Twenty-two articles reported the specific age groups of the included participants, and all participants were adults aged 15 years and older. About 70% of the articles studied areas in the WHO Region of the Americas, and 90% were from high-income countries (Fig. 2A): the United States (n = 22), Australia (n = 5), Canada (n = 1), and the United Kingdom (n = 1). The rest (n = 3) were from upper-middle-income countries: China, Iran, and Malaysia (Fig. 2B). Four-fifths of studies (n = 26) were conducted at the subnational level, covering either one or more administrative units (states, provinces, regional administrative units) within the country (n = 21) or some locations within the administrative units (n = 5; Table 1).
All articles reported the geographical units used for the spatial analysis, with the most commonly used units being census tracts (n = 9), counties (n = 5), and participants’ latitude and longitude (n = 4). Of the 14 articles that reported the number of geographical units, over half (n = 8) used 400 units or more (Table 1).
The most common keywords, as provided by the original authors of the articles, were survival, cancer, colorectal cancer, geography, spatial model, and relative survival. Spatial survival, spatiotemporal, and accelerated failure time models were also frequently used keywords (Supplementary Fig. S1). Of the articles that visualized and/or analyzed spatial survival outcome measures (n = 28), 26 reported the software/packages used for their data analysis, with R (n = 13), WinBUGS (n = 5), ArcGIS/ArcMap (n = 4), and SaTScan (n = 3) being among the most frequently used (Table 1).
Temporal trend of publications with spatial survival methods
The number of publications using spatial survival analysis increased over time (Fig. 3). The earliest publication was in 2007, with a higher number of publications from 2016 (n = 4) and onward, with the highest number reported in 2020 (n = 7). In 2022, two articles were accessed, and approximately four articles were projected to be published by the end of the year (Fig. 3). The mean total follow-up period (time from diagnosis to the end of the study) was 14 years (range, 5–41 years). Approximately four-fifths (n = 26) of the articles had a follow-up period of 10–20 years (Supplementary Fig. S2).
Spatial survival modeling options and visualization techniques
Nearly four-fifths (n = 25) of the articles used spatial survival modeling methods. The remaining articles either mapped survival estimates for small geographical areas or estimated and mapped survival based on spatial proximity to cancer care services. Eight different spatial survival modeling methods were reported, with Bayesian spatial survival models being the most frequently employed (n = 7), followed by Cox proportional hazards additive models (n = 4), and accelerated failure time models (n = 3). Smoothing techniques for eight of nine spatial survival modeling methods were reported. Model-based smoothing was the most frequently used method (n = 15), followed by expanding windows (n = 5) and penalized splines (n = 2). Model-based smoothing methods assume that the spatial effect for each area is the average of its neighbors’ spatial effects. Specifically, the smoothing techniques reported in this review included intrinsic conditional autoregressive priors (n = 7), locally weighted regression (LOWESS; n = 4), Leroux priors (n = 2), and ordinary kriging (n = 2). Most of the articles that used Bayesian spatial survival models smoothed estimates using intrinsic conditional autoregressive priors (n = 5), which assume that the effects of areas are correlated with neighboring areas’ effects, while the remaining articles (n = 2) used the Leroux model, which assumes that the effects of areas have both a structured and an unstructured component.
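To illustrate how the two priors differ, the sketch below computes the conditional mean and variance of one area's spatial effect given its neighbors, using the standard textbook forms of the intrinsic conditional autoregressive and Leroux priors. It is not taken from any of the reviewed articles; the function names, the four-area adjacency structure, the effect values, and the parameters `sigma2` and `rho` are all illustrative assumptions.

```python
def icar_conditional(effects, neighbors, area, sigma2=1.0):
    """Intrinsic CAR: the mean is the average of the neighbors' effects,
    and the variance shrinks as the number of neighbors grows."""
    nbrs = neighbors[area]
    mean = sum(effects[j] for j in nbrs) / len(nbrs)
    return mean, sigma2 / len(nbrs)


def leroux_conditional(effects, neighbors, area, sigma2=1.0, rho=0.5):
    """Leroux: rho in [0, 1] mixes a spatially structured component
    (rho = 1 gives the intrinsic CAR) with an unstructured one (rho = 0)."""
    nbrs = neighbors[area]
    denom = rho * len(nbrs) + 1.0 - rho
    mean = rho * sum(effects[j] for j in nbrs) / denom
    return mean, sigma2 / denom


# Hypothetical adjacency list for four small areas and made-up effects.
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
effects = [0.4, -0.2, 0.1, 0.3]

print("ICAR, area 1:", icar_conditional(effects, neighbors, 1))
print("Leroux (rho = 0.5), area 1:", leroux_conditional(effects, neighbors, 1))
```

With `rho = 1` the Leroux conditional reduces to the intrinsic conditional autoregressive form, and with `rho = 0` the areas become independent, which is one way to read the "structured and unstructured" description above.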
Over the most recent 5 years (2018–2022), published articles used five different spatial survival modeling methods, whereas the three remaining methods (adaptive spatial filtering, Bayesian spatiotemporal flexible parametric relative survival model, and spatial scan statistic for survival data) were used in articles published prior to 2018 (see Table 2 for descriptions of the models). Three methods (spatial survival models, Bayesian geoadditive models, and accelerated failure time models) were used only in articles published within the last 4 years (2019–2022). Bayesian spatial survival models have been used in articles published since 2008, half of which were published in the past 2 years (2021–2022).
Model type | Description | Number of articles using the model (references) |
---|---|---|
Bayesian spatial survival model | Bayesian smoothing methods have been used to estimate excess HRs after adjusting for risk factors such as age. Excess HRs describe the risk of death in excess of population mortality for each small area compared with the whole study region. Hazards are assumed to be constant over the study follow-up period. Smoothing for neighboring areas can be carried out with the intrinsic conditional autoregressive or Leroux priors. Markov chain Monte Carlo simulations are employed to estimate model parameters and area-level effects (50, 54). | 7 (19, 33, 34, 42, 50, 54, 56) |
Cox proportional hazards additive models | This model can be used to obtain HRs for each area unit relative to the whole study area's average hazard. A locally weighted regression (LOWESS) smoother is applied to the Cox proportional hazards model and study participants’ longitude and latitude while adapting to local population density. Akaike's information criterion can be used to determine the optimal smoothness via a span parameter. For example, a span size of 0.3 means that each area unit expands until 30% of its data is from the nearest areas (39, 48). | 4 (39, 44, 48, 49) |
Accelerated failure time model | The model is usually fully parametric and is used to assess survival predictors, including geographical area (represented by spatial correlations). The model has the flexibility to assess the relationship between log survival time and fixed effects (covariates) and random effects (spatial or spatiotemporal). Survival time is estimated under the accelerated failure time assumption that covariates speed up or slow down the time to event by a constant factor (14, 15, 40). | 3 (14, 15, 40) |
Spatial scan statistic for survival data | This method estimates survival on a continuous scale using the exponential function and explores associations with areas. Survival is estimated using the exponential probability distribution by comparing the mean survival inside the designated area (a circular scanning window that could contain as few as two cases up to as many as 50% of total cases) with mean survival outside the designated area. The cut-off point for the window size is decided by the investigators based on their interests and factors such as number of populations per cluster or geographical size. Investigators could also use preexisting window size. Maximum likelihood estimation is used to test the variation of survival in each circular scanning window where the null hypothesis is equal mean survival between inside and outside scanning windows. Monte Carlo simulations are used for estimating the P values for the maximum likelihood estimation. The analysis can be carried out using SaTScan software (55). | 3 (51, 52, 55) |
Bayesian spatiotemporal flexible parametric relative survival model | This model allows assessment of the relationship between relative survival and spatial random effects and complex interaction terms (such as between space and time). Smoothed survival functions can also be predicted. Smoothing for neighboring areas can be carried out using the intrinsic conditional autoregressive distribution. Markov chain Monte Carlo methods can be used for the analysis (43, 46). | 2 (43, 46) |
Bayesian geoadditive model | This model is used for assessing the relationship of survival with covariates, including nonlinear continuous and/or time-varying factors, spatial random effects, and fixed effects. It allows estimation of the adjusted hazard rate by controlling for individual- and area-level factors. The hazard rate of each area is smoothed using penalized splines (P-splines). The analysis could be carried out using Markov chain Monte Carlo simulation (21). | 2 (21, 41) |
Adaptive spatial filtering | This method is used to obtain a continuous surface of cancer survival rates. A circular zone (“spatial filter”) is placed over each area unit in the study region. The radius of each circular zone is then expanded by aggregating the nearest area units until a predetermined minimum number of cancer cases are reached. When this threshold has been attained for each zone, there will be sufficient data within the zones so that estimates of survival rates will have stabilized. The size of circular zones ranges from 12 to 36 miles (18, 45). | 2 (18, 45) |
Spatial Survival Model | This model adds Bayesian spatial correlation terms to the parametric proportional hazards model, where the baseline hazard follows a Weibull distribution. The smoothing could be conducted using ordinary kriging, which accounts for neighboring values and spatial autocorrelations. The analysis could be conducted using Markov chain Monte Carlo simulation in R software via the “spatsurv” package (37). | 2 (35, 37) |
Survival measures were visualized in 23 articles using HRs (n = 8), excess HRs (n = 5), 5-year survival rate (n = 3), median survival time (n = 3), spatial frailty (n = 2), excess OR (n = 1), and probability of exceedance for hazard (n = 1; Table 1). Eight of the remaining nine articles additionally visualized nonsurvival measures such as the distribution of health facilities and population density. One article did not visualize any spatial outcomes.
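As a rough illustration of the adaptive spatial filtering method described in the table of modeling approaches, the sketch below expands a zone around each area unit by adding the nearest units until a minimum number of cases is reached, then computes a crude survival proportion for that zone. It is a minimal sketch and not the implementation used in the reviewed articles: the function name `adaptive_filter_survival`, the field names (`x`, `y`, `cases`, `survivors`), the use of straight-line distance between centroids, and the toy data are all assumptions made for illustration.

```python
import math


def adaptive_filter_survival(areas, min_cases):
    """Crude survival proportion per area unit using an adaptive circular zone.

    `areas` is a list of dicts with hypothetical keys: id, x, y (centroid
    coordinates), cases (diagnosed cases) and survivors (cases alive at the
    end of follow-up). All names are illustrative assumptions.
    """
    results = []
    for centre in areas:
        # Order every area unit by straight-line distance from this centroid;
        # the centre itself comes first at distance 0.
        def dist(a, c=centre):
            return math.hypot(a["x"] - c["x"], a["y"] - c["y"])

        cases = 0
        survivors = 0
        radius = 0.0
        for unit in sorted(areas, key=dist):
            # Aggregate the next-nearest unit and grow the zone radius to reach it.
            cases += unit["cases"]
            survivors += unit["survivors"]
            radius = dist(unit)
            if cases >= min_cases:
                break  # threshold reached: stop expanding the zone

        rate = survivors / cases if cases else float("nan")
        results.append(
            {"id": centre["id"], "radius": radius, "cases": cases, "survival": rate}
        )
    return results


# Toy data (invented for illustration only).
toy_areas = [
    {"id": "A", "x": 0, "y": 0, "cases": 10, "survivors": 6},
    {"id": "B", "x": 3, "y": 4, "cases": 20, "survivors": 10},
    {"id": "C", "x": 10, "y": 0, "cases": 30, "survivors": 24},
]

for row in adaptive_filter_survival(toy_areas, min_cases=25):
    print(row)
```

In this toy example, areas A and B each have too few cases on their own, so their zones expand to include each other, whereas area C already meets the threshold and keeps a zero radius. A real analysis would use observed or relative survival estimates rather than a crude proportion, and may cap the zone radius, as in the 12 to 36 mile range reported in the table.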
Discussion
This review has highlighted, for the first time, the increasing utilization of spatial methods to describe disparities in cancer survival and the wide variety of available spatial survival modeling methods and visualization techniques. Several of these published articles have reported substantial geographic variation in cancer survival (18–21), suggesting inequities in diagnosis and/or treatment strategies. These types of spatial survival studies provide quantitative evidence that is often required to motivate efforts to understand why inequalities exist, and then ultimately to intervene to reduce these inequalities.
Although the use of spatial survival methods has increased over time, the application of these spatial methods to date has been restricted to only a few countries, particularly those with established cancer registries and favorable data release policies. The wide variety of spatial survival models and methods of visualizing the results provides many different and informative ways of interpreting the data, yet this tends to make comparisons across studies difficult.
The increasing number of articles over time could be related to the increasing number of cancer registries, with data having the required population coverage and geographical precision, as well as methodologic developments and technological advancements. Population-based cancer registries were the main data sources for the included articles, and their complete population coverage of diagnoses and subsequent follow-up is required for the accurate measurement of spatial variation in cancer survival (22). A recent study reported that a quarter (26%) of all countries worldwide do not have any cancer registry, with about an additional 32% having a cancer registry but without national coverage (i.e., limited to subnational or hospital-based registries; ref. 23). However, the number and coverage of cancer registries worldwide have increased substantially over the past 50 years, and with the recent opening of Regional Hubs in Asia, Africa, the Caribbean, Latin America, and the Pacific Islands, it is likely that this expansion will continue (22). Hence, these methods have increasing potential for application to a broader range of populations.
Another reason for the limited application of spatial survival methods in those countries with population-based cancer registries could be the barriers regarding establishing and effectively maintaining population-based cancer registries, including linkage with complete population level mortality information, along with policies enabling the statistical data to be accessed and analyzed at the required geographical precision while ensuring confidentiality of the original record (24–26). In addition, researchers need to have the necessary statistical capability and computational capacity for the spatial survival models (27, 28). While the expansion of population-based cancer registries internationally is encouraging (22), the more widespread use of these spatial survival models internationally will be conditional on these barriers being removed.
A large variety of spatial survival methods and visualization options have been reported, with eight different spatial survival modeling methods being identified in this review. This highlights the range of statistical methods available for these types of analyses, and specific methods are often selected on the basis of the characteristics of the data being analyzed or the outcome measures required. However, these different methods do limit the ability to directly compare published spatial patterns between countries.
In addition, the detailed geographical information within these study data means that, even when deidentified, it is typically not possible to get ethics or data custodian approval to enter into a research collaboration. Further development of the increasingly popular federated learning methods (29) could potentially make the application of these spatial survival models across different research groups internationally possible without requiring the original data to be shared. In addition, platforms such as GitHub, along with synthetically generated example datasets or online technical reports such as that released through the Australian Cancer Atlas project (30), provide an excellent opportunity to share expertise and ensure that the same approaches can be used between research groups.
Because the correct interpretation of spatial survival maps can be complex, it is important to increase the training and collaboration opportunities available for researchers and policy makers. Studies in China, Bangladesh, and the United Kingdom have highlighted a number of barriers to understanding spatial analysis results and applications in the decision-making process, including technical limitations, political complexities, and cultural sensitivities (29, 31, 32).
Strengths and limitations
This is the first review to synthesize published articles on the spatial analysis of cancer survival. The review was conducted according to PRISMA guidelines with multiple databases that were searched with complex queries; however, it is possible that the search terms and inclusion criteria used could have unintentionally led to the exclusion of relevant articles. This may include relevant peer-reviewed articles published in languages other than English. In addition, reports in grey literature were not included as they are typically not peer-reviewed.
Conclusion
This review identified an increasing trend in publications on spatial survival analyses in cancer research. It highlighted a large variety of spatial survival modeling options and visualization techniques, each providing a different approach for reporting and interpreting the data. Although more than 25 years have passed since the WHO recommended using small-area estimated disease mapping in decision-making processes (5, 7), the current application for cancer survival outcomes is mostly limited to high-income countries. A greater worldwide uptake of these methods could be facilitated through enhanced cancer registry data, knowledge sharing, and collaboration between research groups.
Authors' Disclosures
J.K. Cameron reports grants from Australian Research Council during the conduct of the study; grants from National Health and Medical Research Council outside the submitted work. No disclosures were reported by the other authors.
Acknowledgments
No external funding was received for this study.
We appreciate Dr. James Retell for providing assistance in visual design.
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).