Abstract
Colorectal cancer screening reduces colorectal cancer incidence and mortality. Risk models based on phenotypic variables have relatively good discrimination in external validation and may improve efficiency of screening. Models incorporating genetic variables may perform better. In this review, we updated our previous review by searching Medline and EMBASE from the end date of that review (January 2014) to February 2019 to identify models incorporating at least one SNP and applicable to asymptomatic individuals in the general population. We identified 23 new models, giving a total of 29. Of those in which the SNP selection was on the basis of published genome-wide association studies, in external or split-sample validation the AUROC was 0.56 to 0.57 for models that included SNPs alone, 0.61 to 0.63 for SNPs in combination with other risk factors, and 0.56 to 0.70 when age was included. Calibration was only reported for four. The addition of SNPs to other risk factors increases discrimination by 0.01 to 0.06. Public health modeling studies suggest that, if determined by risk models, the range of starting ages for screening would be several years greater than using family history alone. Further validation and calibration studies are needed alongside modeling studies to assess the population-level impact of introducing genetic risk–based screening programs.
Introduction
Colorectal cancer is the second leading cause of cancer-related death in Europe and the United States (1). There is good evidence that screening adults in the general population who are at average risk with fecal occult blood testing (FOBT), flexible sigmoidoscopy, or colonoscopy reduces colorectal cancer incidence and mortality (2–7). However, as with all screening programs, colorectal cancer screening has the potential to cause harm, both directly to those screened and indirectly through diversion of resources away from other services. Targeted or stratified screening could potentially provide a way of reducing complication rates and demand on services while still ensuring those at greatest risk are effectively screened. For example, the U.S. Multi-Society Task Force on Colorectal Cancer endorses a risk-stratified approach with fecal immunochemical testing screening in populations with an estimated low prevalence of advanced neoplasia and colonoscopy screening in high prevalence populations (8).
We have previously published a systematic review of risk prediction models for colorectal cancer and identified 40 models that have been developed and could potentially be used for risk stratification (9). These range from models including only data routinely available from electronic health records, such as age, sex, and body mass index (BMI), to more complex models containing detailed information about lifestyle factors and genetic information. Using the UK Biobank cohort for external validation we have shown that several of those including only phenotypic risk factors and/or family history exhibit reasonable discrimination in a UK population (10). At the time of the literature search for that review (January 2014) only six risk models incorporating genetic risk factors and predicting future risk of developing colorectal cancer had been published. Their performance was similar to models including only phenotypic information. Since then, findings from genome-wide association studies (GWAS) have resulted in a rapid rise in the number of published risk models incorporating genetic information. Simulation studies have also shown that using genetic information to stratify screening has the potential to improve efficiency by reducing the number of individuals screened while still detecting as many cases (11, 12). It is not clear, however, which genetic risk models perform best, how much combining common genetic variants with phenotypic risk factors improves model performance, or the potential public health impact of incorporating these models into screening programs.
To inform future stratification of colorectal cancer screening using genetic data, we have updated our previous systematic review to identify and synthesize the performance of all published colorectal cancer risk prediction models that include common genetic variants, together with estimates of the potential public health impact of stratifying populations for screening based on genetic risk.
Materials and Methods
We updated a previous systematic review following a published study protocol (PROSPERO 2018 CRD42018089654 available from: http://www.crd.york.ac.uk/PROSPERO/display_record.php?ID=CRD42018089654).
Search strategy
We searched Medline, EMBASE, and the Cochrane Library from January 2014 (the end date of the search in our previous review) to February 2019, applying the same search strategy used in our previous review, with no language limits (see Supplementary Materials and Methods S1 for the complete search strategy for Medline and EMBASE). We subsequently manually screened the reference lists of all included papers.
Study selection
We included studies if they met all of the following criteria: (i) were published as a primary research paper in a peer-reviewed journal; (ii) provided a measure of relative or absolute risk using a combination of two or more risk factors, including at least one SNP, that allows identification of individuals at higher risk of colon, rectal or colorectal cancer, or advanced colorectal neoplasia; (iii) reported a measure of discrimination (e.g., C-statistic, AUROC), or calibration (e.g., Hosmer–Lemeshow statistic, observed/expected ratio), or a quantitative estimate of the implications of using the risk model for stratified screening; and (iv) included data applicable to the general population (i.e., the risk model was not specifically designed for individuals known to carry specific high-risk mutations or from families with a known cancer syndrome, such as familial adenomatous polyposis or hereditary nonpolyposis colorectal cancer). As in our previous review, studies including only highly selected groups, for example immunosuppressed patients, organ transplant recipients, or those with a previous history of colon and/or rectal cancer were excluded. We also included studies published prior to January 2014 that had been identified in our previous review if they met the above criteria.
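The two kinds of performance measure named in criterion (iii) can be illustrated with a minimal sketch. It assumes a binary outcome (1 for a case, 0 for a non-case) and a predicted risk per person; the function names and the toy data are hypothetical and are not taken from any included study.

```python
from typing import Sequence


def auroc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Discrimination: probability that a randomly chosen case has a higher
    score than a randomly chosen non-case (ties count as one half)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for non_case_score in non_cases:
            if case_score > non_case_score:
                wins += 1.0
            elif case_score == non_case_score:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


def observed_expected_ratio(predicted_risks: Sequence[float],
                            outcomes: Sequence[int]) -> float:
    """Calibration: observed number of events divided by the number
    expected from the predicted risks (1.0 means perfect agreement)."""
    expected = sum(predicted_risks)
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return sum(outcomes) / expected


# Toy data, purely illustrative.
toy_risks = [0.1, 0.2, 0.3, 0.4]
toy_outcomes = [0, 1, 0, 1]
toy_auroc = auroc(toy_risks, toy_outcomes)
toy_oe = observed_expected_ratio(toy_risks, toy_outcomes)
```

An AUROC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, which is the scale on which the values reported in this review should be read.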
One reviewer (L. McGeoch) performed the search and screened 67% of the titles and abstracts to exclude papers that were clearly not relevant. The remaining 33% of titles and abstracts were divided among four reviewers (J.A. Usher-Smith, S.J. Griffin, J.D. Emery, and F.M. Walter) for screening. The four reviewers also each independently assessed a random selection of 3% of the papers screened by L. McGeoch. The full text of every paper for which a definite decision to reject could not be made from the title and abstract alone was independently assessed by two reviewers (L. McGeoch and J.A. Usher-Smith/S.J. Griffin/J.D. Emery/F.M. Walter). Those assessed as not meeting the inclusion criteria by both reviewers were excluded. Those for which it was not clear were discussed with the wider research team. One paper was translated into English for assessment and subsequent data extraction.
Data extraction and synthesis
Data were extracted independently by two researchers (L. McGeoch and J.A. Usher-Smith/S.J. Griffin/J.D. Emery) directly into data tables to minimize bias. These tables included details on: (i) the development of the model, including potential sources of bias such as the selection processes for participants and SNPs; (ii) the risk model itself, including the variables included; (iii) the methods of model development (genetic and phenotypic components); (iv) the performance measures [discrimination (e.g., C-statistic, AUROC), or calibration (e.g., Hosmer–Lemeshow statistic, observed/expected ratio)] of the risk model in the development population; (v) any external validation studies of the risk model, including the study design and performance of the risk model; and (vi) any public health modeling of the potential impact of using the risk models in practice. In articles that reported performance data for multiple stepwise models developed in the same population, we included only the best performing model in our main analysis. If performance data were presented separately for a model including only SNPs and a model including both SNPs and phenotypic variables in the same article, these were considered as two models. If performance data were presented separately for models that incorporated the same SNPs but were developed using unweighted allele counting or with allele weights derived either from the literature or the study population, we extracted both sets of data. To assess the incremental effect on performance of incorporating SNPs into the risk models, we additionally extracted data on the performance of the models including only phenotypic risk factors and/or family history, where they were reported.
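The distinction between unweighted allele counting and weighted allele models can be made concrete with a short sketch. It assumes each SNP is coded as the number of risk alleles carried (0, 1, or 2) and that the weights are per-allele log odds ratios; the SNP identifiers and odds ratios below are invented for illustration and do not come from any included model.

```python
import math
from typing import Mapping


def unweighted_grs(risk_allele_counts: Mapping[str, int]) -> int:
    """Unweighted allele counting: the total number of risk alleles carried."""
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk allele count must be 0, 1, or 2")
    return sum(risk_allele_counts.values())


def weighted_grs(risk_allele_counts: Mapping[str, int],
                 per_allele_odds_ratios: Mapping[str, float]) -> float:
    """Weighted allele model: each risk allele contributes the log of its
    per-allele odds ratio (taken from the literature or from the study)."""
    return sum(
        count * math.log(per_allele_odds_ratios[snp])
        for snp, count in risk_allele_counts.items()
    )


# Hypothetical genotype and hypothetical published per-allele odds ratios.
genotype = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
odds_ratios = {"snp_a": 1.10, "snp_b": 1.25, "snp_c": 1.05}

count_score = unweighted_grs(genotype)               # 3 risk alleles
weighted_score = weighted_grs(genotype, odds_ratios)
```

The two scores rank individuals identically only when every SNP has the same effect size, which is why both sets of performance data were extracted where a study reported them.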
At the same time as data extraction, an overall assessment of risk of bias was performed using four domains from the CHARMS checklist (study population, predictors, outcome, and sample size and missing data; ref. 13). We also classified studies into the following groups according to the TRIPOD guidelines (14):
(i) development only (1a);
(ii) development and validation using resampling (1b);
(iii) random- (2a) or nonrandom- (2b) split sample development and validation;
(iv) development and validation using separate data (3); or
(v) validation only (4).
For the models including only SNPs, a model developed using SNPs selected from the literature, either with unweighted allele counting, or with allele weights derived from the literature, was considered as group 3 (development and validation using separate data). However, if the model used weights derived from the study population, or if the model included only the SNPs found to be significantly associated with colorectal cancer in the study population, we assigned it to either group 1b, 2a, 2b, or 3, depending on the relationship between the study population and the testing population. Simulated populations were considered external populations.
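The assignment rule for SNP-only models can be sketched as a small function. This is one reading of the rule under stated assumptions, not the procedure actually used by the reviewers: the argument names and category labels are hypothetical, and mapping a model with no validation data to group 1a is an assumption taken from the TRIPOD list above rather than from the rule itself.

```python
def tripod_group_snp_only(snps_selected_in_study_population: bool,
                          weights_source: str,
                          validation: str) -> str:
    """Return a TRIPOD group for a SNP-only model.

    weights_source: "none" (unweighted counting), "literature", or "study".
    validation: "none", "resampling", "random_split", "nonrandom_split",
                "separate", or "simulated".
    """
    if weights_source not in {"none", "literature", "study"}:
        raise ValueError(f"unknown weights_source: {weights_source}")

    # SNPs taken from the literature and combined without study-derived
    # weights: treated as development and validation using separate data.
    if not snps_selected_in_study_population and weights_source != "study":
        return "3"

    # Otherwise the group depends on how the testing population relates to
    # the study population; simulated populations count as external.
    groups = {
        "none": "1a",          # assumption: development only
        "resampling": "1b",
        "random_split": "2a",
        "nonrandom_split": "2b",
        "separate": "3",
        "simulated": "3",
    }
    if validation not in groups:
        raise ValueError(f"unknown validation: {validation}")
    return groups[validation]
```

For example, a model built from published SNPs with published log odds as weights returns "3" regardless of the validation argument, whereas the same SNPs weighted by study-derived coefficients and tested in a random split sample return "2a".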
Results
From 12,394 articles we excluded 12,277 at title and abstract level and a further 103 after full-text assessment. After title and abstract screening by the first reviewer, no additional papers met the inclusion criteria in the random 12% screened by a second reviewer. There was also complete agreement among researchers at the full-text level. The most common reasons for exclusion were that the papers did not include a risk score (n = 43), were conference abstracts (n = 19), or did not include any performance measures (n = 23; Supplementary Fig. S1). Four were also excluded as they described models that were developed to detect prevalent undiagnosed disease rather than estimate future incident disease risk.
Four more articles were identified through citation searching. The addition of four articles (six risk models) that had been included in our previous systematic review gave a total of 22 articles describing 29 risk models for inclusion in the analysis. Table 1 summarizes these 29 risk models. Except for the model by Weigl and colleagues (15) that included colorectal cancer or advanced adenoma as the outcome, all had colorectal cancer as the outcome. The paper by Jung and colleagues (16) developed separate models for colorectal, colon, and rectal cancer. As these were the only models for colon and rectal cancer, we included only the model for colorectal cancer in the analysis. Nine models included only SNPs, six included SNPs plus phenotypic factors but not age, and 14 a combination of SNPs, phenotypic factors, and age. The number of SNPs included in the models ranged from 3 to 95.
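The article and model counts reported above can be checked with a few lines of arithmetic. The numbers are those stated in the text; the variable names are only illustrative.

```python
# Study flow, using the counts reported in the Results.
records_screened = 12_394
excluded_title_abstract = 12_277
excluded_full_text = 103
from_citation_search = 4
from_previous_review = 4   # four articles describing six risk models

included_from_search = (
    records_screened - excluded_title_abstract - excluded_full_text
)
total_articles = (
    included_from_search + from_citation_search + from_previous_review
)

# Risk models by the type of factors they include.
models_by_group = {
    "SNPs only": 9,
    "SNPs plus phenotypic factors, excluding age": 6,
    "SNPs, phenotypic factors, and age": 14,
}
total_models = sum(models_by_group.values())

print(included_from_search, total_articles, total_models)  # 14 22 29
```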
Author, year | Country | Outcome | Factors included in score | Selection of SNPs | Method of development of GRS | Selection of phenotypic factors | Method of development of combined model | TRIPOD level^a |
---|---|---|---|---|---|---|---|---|
Genetic risk factors alone | ||||||||
Dunlop 2013a (26) | UK, Canada, Australia, USA, and Germany (d); Sweden and Finland (v) | CRC | 10 SNPs | Published GWAS studies from European populations | Unweighted allele counting model | — | — | 3 |
Frampton 2016 (12) | UK (v) | CRC | 37 SNPs | Published GWAS studies from European populations | Weighted allele model weighted by published log odds | — | — | 3 |
Hosono 2016a (47) | Japan (d, v) | CRC | 6 SNPs | Published GWAS studies from European populations followed by logistic regression | Unweighted allele counting model | — | — | 2b |
Huyghe 2019 (34) | European (91.7%) and East Asian (8.3%; d) | CRC | 95 SNPs | GWAS study | Weighted allele model weighted by study-derived weights | — | — | 1a |
Ibanez-Sanz 2017a (21) | Spain (d, v) | CRC | 21 SNPs | Published GWAS studies included within European Bioinformatics Institute | Unweighted allele counting model (weighted allele models weighted by published log odds and study-derived log odds similar so not reported) | — | — | 3 |
Jenkins 2016 (46) | Australia, Canada, and USA (v) | CRC | 45 SNPs | Published GWAS studies from European populations | Weighted allele model weighted by published log odds | — | — | 3^b, 4^b |
Smith 2018a (23) | UK (d, v) | CRC | 41 SNPs | Published GWAS studies from predominantly European and white populations | Weighted allele model weighted by published log odds | — | — | 3 |
Wang 2013 (48) | Taiwan (d, v) | CRC | 16 SNPs | Published GWAS studies from Asian populations followed by replication analysis and jack-knife selection | Logistic regression | — | — | 1b |
Xin 2018a (27) | China (d, v) | CRC | 14 SNPs | Published GWAS studies from European or Asian populations | Unweighted allele counting model; weighted allele model weighted by published log odds; weighted allele model weighted by study-derived weights | — | — | 3 |
Genetic plus phenotypic risk factors excluding age | ||||||||
Ibanez-Sanz 2017b (21) | Spain (d, v) | CRC | 21 SNPs, family history of CRC, alcohol use, BMI, physical exercise, red meat and vegetable intake, and NSAIDs/aspirin use | Published GWAS studies included within European Bioinformatics Institute | Unweighted allele counting model (weighted allele models weighted by published log odds and study-derived log odds similar so not reported) | Logistic regression | Logistic regression | 1b |
Jeon 2018a (25) | Australia, Canada, Germany, Israel, and USA (d, v) | CRC (female) | 63 SNPs, height, BMI, education, history of type 2 diabetes mellitus, smoking status, alcohol consumption, regular aspirin use, regular NSAID use, regular use of postmenopausal hormones, smoking, intake of fiber, calcium, folate, processed meat, red meat, fruit, vegetables, total-energy, and physical activity | Published GWAS studies from predominantly European and Asian populations | Weighted allele model weighted by study-derived estimated regression coefficients | No details given—all considered included | Logistic regression | 2a |
Jeon 2018b (25) | Australia, Canada, Germany, Israel, and USA (d, v) | CRC (male) | 63 SNPs, height, BMI, education, history of type 2 diabetes mellitus, smoking status, alcohol consumption, regular aspirin use, regular NSAID use, smoking, intake of fiber, calcium, folate, processed meat, red meat, fruit, vegetables, total-energy, and physical activity | Published GWAS studies from predominantly European and Asian populations | Weighted allele model weighted by study-derived estimated regression coefficients | No details given—all considered included | Logistic regression | 2a |
Procopciuc 2017 (18) | Romania (d) | CRC | 7 SNPs, gender, alcohol, and fried red meat | Candidate genes on metabolic pathway | Logistic regression | Logistic regression | Logistic regression | 1a |
Xin 2018b (27) | China (d, v) | CRC | 14 SNPs, smoking status | Published GWAS studies from European or Asian populations | Unweighted allele counting model | No details given—all considered included | Logistic regression | 3 |
Yarnall 2013 (49) | UK (v) | CRC | 14 SNPs, BMI, smoking, alcohol, fiber intake, red meat intake, and physical activity | Published GWAS studies from predominantly European populations | Simulation-based procedure using REGENT software | Literature review—all considered included | Simulation-based procedure using REGENT software | 3^b |
Genetic plus phenotypic risk factors including age | ||||||||
Abe 2017 (50) | Japan (d, v) | CRC | 11 SNPs, age, sex, referral pattern, current BMI, smoking, alcohol consumption, regular exercise, family history of colorectal cancer in a first degree relative, and dietary folate intake | Published GWAS studies from European and East Asian populations followed by logistic regression | Unweighted allele counting model | No details given—all considered included | Logistic regression | 2b |
Dunlop 2013b (26) | UK, Canada, Australia, USA, and Germany (d); Sweden and Finland (v) | CRC | 10 SNPs, age, gender, and first degree relative with CRC | Published GWAS studies from European populations | Unweighted allele counting model | No details given—all considered included | Logistic regression | 3 |
Hosono 2016b (47) | Japan (d, v) | CRC | 6 SNPs, age, referral pattern, current BMI, smoking, alcohol consumption, regular exercise, family history of CRC, and dietary folate intake | Published GWAS studies from European populations followed by logistic regression | Unweighted allele counting model | No details given—all considered included | Logistic regression | 2b |
Hsu 2015 (24) | USA and Germany (d, v) | CRC | 27 SNPs, age, sex, family history of CRC, history of endoscopic examinations | Previous GWAS studies from European and East Asian populations | Unweighted allele counting model (weighted model weighted by published log odds similar so not reported) | No details given—all considered included | Logistic regression | 3 |
Iwasaki 2017 (22) | Japan (d, v) | CRC (male) | 6 SNPs, age, BMI, alcohol, and smoking status | Previous published model and GWAS from European and East Asian populations followed by Cox proportional hazards modeling | Weighted allele model weighted by study-derived log-transformed per allele HR | From previous model (Ma) except for physical activity | Weighted Cox proportional hazards regression | 1b |
Jo 2012a (17) | Korea (d, v) | CRC (female) | 5 SNPs, age, and family history of CRC | GWAS study in Korean population with significance level of P < 10⁻⁶ | Unweighted allele counting model; weighted allele model weighted by study-derived beta coefficients | No details given—all considered included | Logistic regression | 1b |
Jo 2012b (17) | Korea (d, v) | CRC (male) | 3 SNPs, age, and family history of CRC | GWAS study in Korean population with significance level of P < 10⁻⁶ | Unweighted allele counting model; weighted allele model weighted by study-derived beta coefficients | No details given—all considered included | Logistic regression | 1b |
Jung 2015 (16) | South Korea (d) | CRC, colon, and rectal cancer | 7 SNPs, age, sex, smoking status, exercise status, fasting serum glucose, and family history of CRC | Published GWAS studies from predominantly European and Asian populations followed by logistic regression | Unweighted allele counting model; weighted allele model weighted by study-derived beta coefficients | No details given—all considered included | Cox proportional hazards regression | 1a |
Jung 2019 (20) | USA (d) | CRC | 4 SNPs, age, and percentage calories from saturated fatty acids | Candidate genes related to insulin-like growth factor and insulin | Weighted allele model weighted by predictive value assessed via minimal depth method in nested random survival forest models | Multi-collinearity testing and univariate and stepwise regression analyses for final set to be included | Random survival forest analysis | 1a |
Li 2015 (51) | China (d) | CRC | 7 SNPs, age, sex, smoking, and drinking | NHGRI GWAS database | Unweighted allele counting model; weighted allele model weighted by study-derived beta coefficients | No details given—all considered included | Logistic regression | 1a |
Shiao 2018 (19) | USA (d, v) | CRC | 5 SNPs, age, gender, BMI, thiamine, MTHFR 677 expression level, HEI score (calories, total fruit, whole fruit, vegetables, dark green, total grains, whole grains, dairy, protein, oil and nuts, saturated fat, sodium, and empty calories) | Candidate genes related to folate metabolism | Unweighted allele counting model | Bootstrap forest prediction modeling | Generalized regression elastic net model (penalized regression) | 1b |
Smith 2018b (23) | UK (d, v) | CRC | 41 SNPs, age, and family history | Published GWAS studies from predominantly European and white populations | Weighted allele model weighted by published log odds | Factors included in Taylor and colleagues model | Standard model: log GRS combined with predicted log HR original model. | 3 |
Smith 2018c (23) | UK (d, v) | CRC | 41 SNPs, age, diabetes, multi-vitamin usage, family history, years of education, BMI, alcohol intake, physical activity, NSAID usage, red meat intake, smoking, and estrogen use (women only) | Published GWAS studies from predominantly European and white populations | Weighted allele model weighted by published log odds | Factors included in Wells and colleagues model | Standard model: log GRS combined with predicted log HR original model. | 3 |
Weigl 2018 (15) | Germany (d) | CRC or advanced adenoma | 48 SNPs, age, sex, previous colonoscopy, physical activity, and BMI | Published GWAS studies from European populations | Unweighted allele counting model | Factors statistically associated with genetic risk categories in controls | Logistic regression | 1a |
Abbreviations: CRC, colorectal cancer; d, development; v, validation; wGRS, weighted genetic risk score.
aTripod level: 1a, development only; 1b, development and validation using resampling; 2a, random split-sample development and validation; 2b, nonrandom split-sample development and validation; 3, development and validation using separate data; 4, external validation.
bSimulated population.
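The two ways of combining SNPs named in Table 1 (unweighted allele counting and a weighted allele model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in any of the included studies: the SNP identifiers, genotypes, and per-allele odds ratios are hypothetical placeholders.

```python
import math

def unweighted_grs(genotypes):
    """Unweighted allele counting: sum of risk-allele counts (0, 1, or 2 per SNP)."""
    return sum(genotypes.values())

def weighted_grs(genotypes, odds_ratios):
    """Weighted allele model: each allele count is multiplied by the per-allele log odds ratio."""
    return sum(count * math.log(odds_ratios[snp]) for snp, count in genotypes.items())

# Hypothetical individual and hypothetical per-allele odds ratios (illustration only).
genotypes = {"snp_a": 2, "snp_b": 0, "snp_c": 1}
odds_ratios = {"snp_a": 1.10, "snp_b": 1.25, "snp_c": 1.05}

count_score = unweighted_grs(genotypes)                 # number of risk alleles carried
log_odds_score = weighted_grs(genotypes, odds_ratios)   # score on the log-odds scale
relative_odds = math.exp(log_odds_score)                # odds relative to carrying no risk alleles
```

The weighted score gives more influence to SNPs with larger published effect sizes, whereas the unweighted count treats every risk allele as equivalent.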
Development of the risk models and risk of bias
Details of the methods used to select the predictors and develop each of the risk models are given in Table 1, with additional details of the setting, design, participants, outcome, and sample size for each study in Supplementary Table S1. The majority of the risk models (n = 18) were developed or validated in white or European individuals. The others were developed or validated in Japanese (n = 4), Korean (n = 3), Chinese (n = 3), and Taiwanese (n = 1) populations.
A summary of the assessment of the risk of bias based on the four domains from the CHARMS checklist (study population, predictors, outcome, and sample size and missing data) is shown in Table 2. Overall we found 12 risk models to be at low risk of bias, 10 at unclear risk, and five at high risk.
Author, year | Study participants | Predictors | Outcome | Sample size and missing data | Overall |
---|---|---|---|---|---|
Genetic risk factors alone | |||||
Dunlop 2013a (26) | + | + | + | + | + |
Frampton 2016 (12) | ? | + | + | ? | ? |
Hosono 2016a (47) | ? | ? | + | ? | ? |
Huyghe 2019 (34) | + | + | + | ? | + |
Ibanez-Sanz 2017a (21) | + | + | + | ? | + |
Jenkins 2016 (46), 2019 (30) | + | + | + | ? | + |
Smith 2018a (23) | + | + | + | + | + |
Wang 2013 (48) | ? | − | + | ? | − |
Xin 2018a (27) | ? | + | + | ? | ? |
Genetic plus phenotypic risk factors excluding age | |||||
Ibanez-Sanz 2017b (21) | + | + | + | ? | + |
Jeon 2018a and b (25) | + | + | + | ? | + |
Procopciuc 2017 (18) | − | ? | + | − | − |
Xin 2018b (27) | ? | ? | + | ? | ? |
Yarnell 2013 (49) | ? | + | + | ? | ? |
Genetic plus phenotypic risk factors plus age | |||||
Abe 2017 (50) | ? | + | + | ? | ? |
Dunlop 2013b (26) | + | ? | + | + | + |
Hosono 2016b (47) | ? | ? | + | ? | ? |
Hsu 2015b (24) | + | ? | + | ? | ? |
Iwasaki 2017b (22) | + | + | + | ? | + |
Jo 2012a and b (17) | ? | − | + | − | − |
Jung 2015 (16) | + | ? | + | ? | ? |
Jung 2019 (20) | + | − | + | ? | − |
Li 2015 (51) | ? | ? | + | ? | ? |
Shiao 2018 (19) | − | ? | + | − | − |
Smith 2018b (23) | + | + | + | + | + |
Smith 2018c (23) | + | + | + | + | + |
Weigl 2018 (15) | + | + | + | ? | + |
NOTE: +, low risk; ?, unclear risk; −, high risk.
Risk of bias within the study participant domain was variable between studies. Those judged to be at unclear or high risk of bias reflected limited or missing details on the inclusion and exclusion criteria used to define study participants and/or use of cases or controls not representative of the general population, for example recruiting spouses or individuals attending outpatient hospital clinics as controls, or recruiting cases from adjuvant chemotherapy clinical trials.
When considering selection of predictors, the majority of the models (n = 18) included SNPs identified for inclusion from new or previously published GWASs in European or Asian-ancestry populations. In six, the authors had used GWAS studies from European or Asian populations to identify SNPs associated with colorectal cancer risk and then selected a subset of these SNPs for inclusion in the risk model on the basis of the associations with disease risk in an independent Japanese or Taiwanese population. Although this method was used to identify SNPs that may be associated with risk in non-European populations, given the small sample sizes of many of the studies and low statistical power, this approach potentially excludes SNPs that are associated with risk in these populations. Two models (17) were developed on the basis of a GWAS study in a Korean population by selecting SNPs with evidence of association at the P < 10⁻⁶ significance level (which is less conservative than the conventionally accepted genome-wide level of significance for a GWAS study, P < 5 × 10⁻⁸). Three more studies (18–20) selected SNPs on the basis of epidemiologic studies and plausible biological mechanisms leading to colorectal cancer (folate metabolism, DNA repair, breakdown of carcinogenic compounds, and insulin-like growth factor and insulin). One of these, the model by Jung and colleagues (20), included both SNPs related to insulin metabolism and dietary fatty acids, potentially overestimating the risk for individuals with the risk allele.
Of the 20 models that include phenotypic risk factors, with or without age, in addition to SNPs, four used regression analyses to select which factors to include (15, 18, 20, 21), one a bootstrap forest prediction model (19), and three (22, 23) used risk factors identified from previous risk models. However, for the majority (n = 12) of models the publications included few details about how phenotypic factors were selected, and whether all those that had been considered were included in the final model. As a consequence, many do not include established risk factors for colorectal cancer.
The outcome (colorectal cancer) was defined histologically or from cancer registries in all studies, reducing the risk of bias due to case misclassification. All studies reported the number of cases and controls used in their development and/or validation analyses. Three included fewer than 150 cases (and hence had low statistical power). Only five studies adequately described how they dealt with missing data, so we cannot be certain that this was done appropriately in the remaining studies.
Discrimination and calibration of the risk models
Discrimination, as measured by the AUROC or C-statistic, was reported for 27 of the 29 risk models and calibration reported for four. The discrimination values are summarized graphically in Fig. 1 and given in Supplementary Table S2, in which models are divided into those that include SNPs only and those that combine SNPs with phenotypic variables with or without age and whether the discrimination was assessed in the development population, bootstrap or a random-split sample, or in an external population or nonrandom-split sample. Where multiple AUROCs or C-statistics for the same model were reported for more than one method, measurement in the development populations always gave the highest discrimination, followed by that in bootstrapping or random-split sample validation studies and then in external populations. Where model performance was included for both men and women, discrimination was higher in men (0.59 in men compared with 0.56 in women, ref. 24; 0.63 in men compared with 0.62 in women, ref. 25; and 0.70 in men compared with 0.60 in women, ref. 17).
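For readers less familiar with the measure, the AUROC (equivalent to the C-statistic for a binary outcome) is the probability that a randomly chosen case has a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the scores are hypothetical and serve only to illustrate the calculation.

```python
def auroc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical predicted risks (illustration only).
cases = [0.9, 0.6, 0.4]
controls = [0.7, 0.4, 0.3, 0.2]
value = auroc(cases, controls)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls, which is the scale on which the values in this section should be read.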
Among the eight models that include only SNPs, the discrimination of seven was reported in external populations. This ranged between 0.56 and 0.60 in real-life populations and 0.63 in simulated populations. Of those assessed in real-life populations, the three considered at low risk of bias (Dunlop and colleagues, ref. 26; Ibanez-Sanz and colleagues, ref. 21; and Smith and colleagues, ref. 23) all have reported AUROCs of 0.56–0.57. Of the 19 risk models incorporating both SNPs and phenotypic variables, the models created by Procopciuc and colleagues (18), Jung and colleagues (20), and Shiao and colleagues (19) have the highest reported discrimination with AUROCs of 0.90 [95% confidence interval (CI), 0.86–0.93] in the development population, 0.93 in the development population, and 0.85 in cross-validation, respectively. In all three cases the SNPs were selected on the basis of candidate–gene association studies as opposed to GWAS studies. The models by Procopciuc and colleagues and Shiao and colleagues were also developed in small case–control studies with only 150 and 53 cases and 162 and 53 controls, respectively, thus the resulting models are likely subject to a high degree of overfitting.
In the remaining models, in which the SNP selection was on the basis of published GWASs, the AUROC in split sample validation or external validation in independent datasets ranged between 0.61 and 0.63 in models excluding age and 0.56 and 0.70 in those including age. The best performing model in an independent validation population was the model by Smith and colleagues (23). Calibration was reported for only four of the 29 risk models. In three, the numbers of predicted colorectal cancers were in line with the observed numbers with nonsignificant P values of 0.086 (18) and 0.336 (27) under a Hosmer–Lemeshow statistic and 0.09 under a Grønnesby and Borgan test (22), respectively. Smith and colleagues (23) assessed calibration graphically and found that the genetic risk score alone was poorly calibrated, with overestimation of risk for those in the top decile of risk. After recalibration, however, both the genetic risk score alone and the genetic plus phenotypic models were well calibrated.
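The Hosmer–Lemeshow approach to calibration mentioned above compares observed and expected numbers of events within groups of increasing predicted risk. The following is a minimal sketch of the test statistic only, assuming equal-sized groups; the data are hypothetical, and the P value (conventionally obtained from a chi-square distribution with the number of groups minus two degrees of freedom) is not computed here.

```python
def hosmer_lemeshow_statistic(predicted, observed, groups=10):
    """Hosmer-Lemeshow chi-square statistic over groups of ascending predicted risk."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected events in the group
        events = sum(y for _, y in chunk)     # observed events in the group
        mean_risk = expected / size
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            statistic += (events - expected) ** 2 / variance
    return statistic

# Hypothetical, perfectly calibrated example: 10% and 50% predicted risk groups.
predicted = [0.1] * 10 + [0.5] * 10
observed = [1] + [0] * 9 + [1] * 5 + [0] * 5
hl = hosmer_lemeshow_statistic(predicted, observed, groups=2)
```

A small statistic (and hence a nonsignificant P value) indicates no detectable departure from good calibration, which in small studies may simply reflect low power.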
Incremental improvement of genetic over family history and/or phenotypic risk factors
Of the models that combined SNPs with family history and/or phenotypic risk factors, 15 compared the discrimination of models including SNPs, family history, and phenotypic risk factors either alone or in combination (Table 3). Together these showed that adding SNPs to family history and/or phenotypic variables, and vice versa, leads to an increase in the AUROC of between 0.01 and 0.06. For example, in a cross-validation sample of a Spanish population, Ibanez-Sanz and colleagues report an AUROC of 0.61 (95% CI, 0.59–0.64) for their environmental risk score comprising alcohol use, family history of colorectal cancer, BMI, physical exercise, red meat and vegetable intake, and NSAIDs/aspirin use, and an AUROC of 0.56 (95% CI, 0.54–0.58) for their genetic risk score comprising 21 SNPs. For the combined risk score, they report an AUROC of 0.63 (95% CI, 0.60–0.66; ref. 21). Iwasaki and colleagues (22), Xin and colleagues (27), and Weigl and colleagues (15) additionally reported that adding genetic risk factors to a model including phenotypic risk factors increased the mean integrated discrimination improvement (IDI) by 0.015 (95% CI, 0.0044–0.027), 0.031 (95% CI, 0.023–0.039), and 0.04 (95% CI, 0.03–0.05), respectively, and the mean continuous net reclassification index (NRI) by 0.39 (95% CI, 0.17–0.58), 0.317 (95% CI, 0.225–0.408), and 0.29 (95% CI, 0.14–0.43), respectively. The study by Smith and colleagues, in which a genetic risk score incorporating 41 SNPs identified from previous GWAS studies was added to two previously published phenotypic risk scores including age and family history of colorectal cancer (28, 29), found that the genetic risk score did not meaningfully improve model discrimination. They did not report the IDI or NRI but overall the addition of genetic information resulted in 4%–5% of individuals having a change in absolute risk of ≥0.3%. For those with an initial estimated absolute risk of <1%, this percentage was 3%, and for those with an estimated absolute risk of ≥1%, 25%–33% had a change in absolute risk of ≥0.3%.
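The IDI and continuous NRI quoted above can be illustrated with a short sketch using their standard definitions: the IDI is the gain in mean predicted risk among cases minus the gain among controls, and the continuous NRI is the net proportion of cases whose predicted risk moves up plus the net proportion of controls whose predicted risk moves down. The predicted risks below are hypothetical and are not taken from any included study.

```python
def mean(values):
    return sum(values) / len(values)

def subset(values, outcome, flag):
    """Values for cases (flag=1) or controls (flag=0)."""
    return [v for v, y in zip(values, outcome) if y == flag]

def idi(old_risk, new_risk, outcome):
    """Integrated discrimination improvement of the new model over the old model."""
    gain_cases = mean(subset(new_risk, outcome, 1)) - mean(subset(old_risk, outcome, 1))
    gain_controls = mean(subset(new_risk, outcome, 0)) - mean(subset(old_risk, outcome, 0))
    return gain_cases - gain_controls

def continuous_nri(old_risk, new_risk, outcome):
    """Category-free NRI: net upward movement in cases minus net upward movement in controls."""
    def net_up(flag):
        moves = [(n > o) - (n < o)
                 for o, n, y in zip(old_risk, new_risk, outcome) if y == flag]
        return sum(moves) / len(moves)
    return net_up(1) - net_up(0)

# Hypothetical predicted risks before and after adding a genetic risk score.
outcome = [1, 1, 0, 0]
old_risk = [0.2, 0.3, 0.2, 0.1]
new_risk = [0.3, 0.4, 0.1, 0.1]
idi_value = idi(old_risk, new_risk, outcome)
nri_value = continuous_nri(old_risk, new_risk, outcome)
```

Because the continuous NRI counts any change in predicted risk, however small, it can be sizeable even when the change in AUROC is modest.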
Author, year | Genetic risk factors only [AUROC (95% CI)] | Family history alone [AUROC (95% CI)] | Phenotypic risk factors only [AUROC (95% CI)] | Genetic risk factors and family history [AUROC (95% CI)] | Phenotypic risk factors and family history [AUROC (95% CI)] | Genetic and phenotypic risk factors combined [AUROC (95% CI)] | Genetic risk factors, family history, and phenotypic risk factors combined [AUROC (95% CI)] |
---|---|---|---|---|---|---|---|
Dunlop 2013 (26) | 0.57 | 0.59 | |||||
Hosono 2016 (47) | 0.60 | 0.70 | 0.72 | ||||
Hsu 2015 (24) | Women 0.55 | Women 0.52 | Women 0.56 | ||||
Men 0.60 | Men 0.51 | Men 0.59 | |||||
Ibanez-Sanz 2017 (21) | 0.56 (0.54–0.58) | 0.60 (0.57–0.61) | 0.61 (0.59–0.64) | 0.63 (0.60–0.66) | |||
Iwasaki 2017 (22) | 0.63a | 0.60a | 0.66a | ||||
Jeon 2018a (female; ref. 25) | 0.54 (0.52–0.55) | 0.59 (0.58–0.60) | 0.60 (0.59–0.61) | 0.62 (0.61–0.63) | |||
Jeon 2018b (male; ref. 25) | 0.53 (0.52–0.54) | 0.59 (0.58–0.60) | 0.60 (0.59–0.61) | 0.63 (0.62–0.64) | |||
Jo 2012 (17) | Women: 0.60 (0.57–0.64) | Women:0.65 (0.62–0.68) | |||||
Men: 0.69 (0.65–0.73) | Men: 0.73 (0.68–0.77) | ||||||
Jung 2015 (16) | 0.73 (0.69–0.78) | 0.74 (0.70–0.78) | |||||
Smith 2018a and b (23) | 0.56 (0.55–0.58) | 0.67 (0.65–0.68) Excluding age: 0.52 (0.51–0.53) | 0.68 (0.66–0.69) | ||||
Smith 2018a and c (23) | 0.57 (0.55–0.58) | 0.68 (0.67–0.69) Excluding age: 0.58 (0.57–0.60) | 0.69 (0.67–0.70) | ||||
Li 2015 (51) | 0.57 (0.55–0.59) | 0.59 (0.57–0.61) | |||||
Weigl 2018 (15) | 0.62 | 0.67 | |||||
Xin 2018b (27) | 0.52 (0.50–0.54) | 0.61 (0.58–0.63) |
aAll models include age in addition to genomic and/or phenotypic risk factors.
Impact of stratifying populations for screening based on genetic risk
Eight studies assessed the potential impact of using the risk models to determine the starting age for screening. Seven of these calculated either the difference in recommended starting age for those at low or high risk or the years earlier those at high risk would be invited. These are summarized in Table 4. Considering SNPs alongside family history would result in individuals in the highest quintile of risk, for example, being invited between 13 and 21 years earlier, with the difference between the invitation ages of the highest and lowest quintiles being between 13 and 27 years. In all cases where estimates were provided for SNPs alone, family history alone, or SNPs and family history combined, the range was greater for SNPs than family history and greater for both combined than for either individually. Jenkins and colleagues (30) additionally estimated that if those in the highest quintile of risk were invited for screening at age 46 and those in the lowest quintile at age 59, 3.32 million people would be screened earlier, of whom 8,000 would be diagnosed with colorectal cancer, and 8.76 million would be screened later, of whom 18,000 would be diagnosed with colorectal cancer.
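The idea of a risk-determined starting age can be sketched as finding the youngest age at which an individual's estimated risk reaches a chosen threshold, such as the average risk of a 50-year-old. The example below is a simplified illustration only: the age-specific baseline risks and relative risks are hypothetical, and scaling the baseline risk by a single relative risk is an assumption rather than the method of any particular study.

```python
def starting_age(baseline_risk_by_age, relative_risk, threshold):
    """Youngest tabulated age at which relative_risk * baseline risk reaches the threshold."""
    for age in sorted(baseline_risk_by_age):
        if relative_risk * baseline_risk_by_age[age] >= threshold:
            return age
    return None  # threshold not reached within the tabulated ages

# Hypothetical average 10-year risks by age (not taken from any study).
baseline = {40: 0.003, 45: 0.005, 50: 0.009, 55: 0.014, 60: 0.020}
threshold = baseline[50]  # risk of an average 50-year-old

high_risk_age = starting_age(baseline, 2.0, threshold)   # hypothetical high-risk individual
average_age = starting_age(baseline, 1.0, threshold)
low_risk_age = starting_age(baseline, 0.5, threshold)    # hypothetical low-risk individual
age_range = low_risk_age - high_risk_age
```

The wider the spread of relative risks produced by a model, the wider the resulting range of starting ages, which is the quantity compared across studies in Table 4.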
. | . | . | Difference in years in recommended starting age for screening between those in the highest and lowest percentiles of risk . | |||
---|---|---|---|---|---|---|
Author, year | Model-specific risk threshold used to determine starting age for screening | Type of risk model/included risk factors | Papers selecting the top and bottom 1% of risk for comparison | Papers selecting the top and bottom 10% of risk for comparison | Papers selecting the top and bottom 20% of risk for comparison | Papers selecting the top and bottom 33% of risk for comparison |
Hsu 2015 (24) | Average 10-year | FH | — | Men: 5 (range 44–49)a | — | — |
risk of a 50-year | Women: 4 (range 50–54)a | |||||
old (0.91%) | FH + SNPs | — | Men: 10 (range 42–52) | — | — | |
Women: 11 (range 47–58) | ||||||
Jenkins 2019 (30) | 0.3% 5-year estimated risk | SNPs | — | — | Men: 10 (range 45–55) | — |
Women: 14 (range 47–61) | ||||||
FH + SNPs | — | — | Men: 22 (range 35–57) | — | ||
Women: 27 (range 35–62) | ||||||
Jenkins 2016 (46) (USA) | 1% 5-year estimated risk | FH + SNPs | — | Men: 27 (range 46–73) | Men: 18 (range 48–66) | — |
Women: 32 (range 48–80) | Women: 21 (range 52–73) | |||||
Jenkins 2016 (46) (Australia) | 1% 5-year estimated risk | FH + SNPs | — | Men: 17 (range 46–63) | Men: 13 (range 48–61) | — |
Women: 23 (range 53–76) | Women: 17 (range 55–72) | |||||
Jeon 2018 (25) | Average 10-year risk of a 50-year old (0.97%) | FH + SNPs + phenotypic | Men:17 (range 38–55) | Men: 11 (range 40–51) | — | — |
Women:21 (range 43–64) | Women: 13 (range 46–59) | |||||
Huyghe 2018 (34) | Average 10-year risk of a 50-year old (1.13% for men and 0.68% for women) | SNPs | Men: 18 (range 41–59) | Men: 10 (range 44–54) | — | — |
Women: 24 (range 45–69) | Women: 12 (range 49–61) | |||||
Weigl 2018 (15) | Average relative risk for a 60-year old with medium genetic risk | SNPs | — | — | — | 17.5 (range 56–73) |
Years earlier for recommended starting age for those in the highest percentiles | ||||||
Author, year | Risk threshold | Risk factors | 1% | 10% | 20% | 33% |
Dunlop 2013 (26) | 5% 10-year estimated risk | FH | Men: >15 (from >75) | — | — | — |
Women: > 12 (from >80) | ||||||
FH + SNPs | Men: > 23 (from >75) | — | — | — | ||
Women: >22 (from >80) | ||||||
Jenkins 2016 (46) (USA) | 1% 5-year estimated risk | FH | — | Men: 12 (from 67)a | — | — |
Women: 12 (from 73)a | ||||||
SNPs | — | Men: 14 (from 67) | Men: 10 (from 67) | — | ||
Women: 14 (from 73) | Women: 11 (from 73) | |||||
FH + SNPs | — | Men: 21 (from 67) | Men: 19 (from 67) | — | ||
Women: 25 (from 73) | Women 21 (from 73) | |||||
Jenkins 2016 (46) (Australia) | 1% 5-year estimated risk | FH | — | Men: 9 (from 61)a | — | — |
Women: 12 (from 71)a | ||||||
SNPs | — | Men: 9 (from 61) | Men: 6 (from 61) | — | ||
Women: 12 (from 71) | Women: 9 (from 71) | |||||
FH + SNPs | — | Men: 15 (from 61) | Men: 13 (from 61) | — | ||
Women: 18 (from 71) | Women 16 (from 71) |
aOn the basis of presence or absence of family history (FH), not top and/or bottom 10%.
The eighth study compared the size of the English population eligible for screening and the number of colorectal cancer cases potentially detectable using age-based screening and personalized screening, in which eligibility is determined by absolute risk calculated using age and the Frampton and colleagues risk score (12). In a simulated population ages 55–69 years, 61% of men and 62% of women would be eligible for age-based screening (≥60 years) and 79% and 77%, respectively, of colorectal cancer cases would be diagnosed in this subset. With screening based on the genetic risk score [≥average risk for an individual aged 60 (men 1.96% and women 1.19%)], 45% of men and 45% of women would be eligible for screening with 69% and 69% of colorectal cancer cases being identified. This translates into 16 percentage points fewer men and 17 percentage points fewer women being eligible for screening, at the cost of detecting 10 and 8 percentage points fewer cases, respectively.
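The trade-off reported for the eighth study follows directly from the percentages quoted above, as the short calculation below shows. The figures are those given in the text; the variable and function names are illustrative.

```python
# Percentages of the simulated population eligible for screening and of
# colorectal cancer cases arising among those eligible, as quoted in the text.
age_based = {"men": {"eligible": 61, "cases": 79}, "women": {"eligible": 62, "cases": 77}}
risk_based = {"men": {"eligible": 45, "cases": 69}, "women": {"eligible": 45, "cases": 69}}

def trade_off(sex):
    """Percentage-point reduction in eligibility and in cases captured for one sex."""
    fewer_eligible = age_based[sex]["eligible"] - risk_based[sex]["eligible"]
    fewer_cases = age_based[sex]["cases"] - risk_based[sex]["cases"]
    return fewer_eligible, fewer_cases

men = trade_off("men")
women = trade_off("women")
```

Both differences are absolute (percentage-point) differences between the two strategies rather than relative reductions.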
Discussion
Key findings
We have identified 29 risk models that incorporate common genetic variants to estimate future incidence of colorectal cancer in average-risk populations and that have either published measures of performance or estimates of the implications of using them for stratified screening. In external independent validation datasets, the three models considered at low risk of bias that include SNPs identified from GWASs all had similar discrimination (AUROC, 0.56–0.57; Dunlop and colleagues, ref. 26; Ibanez-Sanz and colleagues, ref. 21; and Smith and colleagues, ref. 23). Among the models that included SNPs in combination with other risk factors, the AUROC in split sample or external validation ranged between 0.61 and 0.63 in models excluding age and 0.56 and 0.70 in those including age. The model with the highest reported discrimination in an independent validation population was the model by Smith and colleagues that included 41 SNPs alongside age, diabetes, multi-vitamin usage, family history, years of education, BMI, alcohol intake, physical activity, NSAID usage, red meat intake, smoking, and estrogen use in women (23). Only four reported data on model calibration. The addition of SNPs to risk scores already including family history and/or phenotypic variables increased discrimination by 0.01 to 0.06. Although this represents a modest increase in discrimination measured in terms of the AUROC, such differences can lead to substantial changes in risk stratification in the population, as illustrated by continuous NRI values of 0.3–0.4 seen in this review and demonstrated in the context of other diseases (31). 
Public health modeling within the studies suggests that if the models were used to determine the starting age for screening, this would result in individuals in the top 20% for risk being invited up to 23 years earlier than if determined by age-based criteria only, with the difference in age at invitation between the highest and lowest risk quintiles being several years greater for models including SNPs alone than for models including family history alone, and the difference for models including both SNPs and family history greater than that for models including either SNPs or family history alone.
Strengths and limitations
The main strengths of this review are the comprehensive literature search that included both subject headings and free text, and the systematic approach we used to screen papers for inclusion. The inclusion of more than one risk model from many of the published papers also enabled us to make comparisons between models that included different groups of risk factors or had been developed using different statistical methods. Although this approach enabled us to identify 23 risk models that have been published since our earlier review, we cannot exclude the possibility that there are others that we did not identify. Genetic research is also a rapidly advancing field with new articles reporting new genetic variants that could be incorporated into future risk scores being published regularly.
Other limitations of this review relate to the studies themselves. Most of the risk models were developed and/or tested in case–control studies. Estimates of absolute risk of developing colorectal cancer are therefore not possible and the collection of phenotypic risk factors will be subject to both recall and responder bias, potentially increasing the apparent discrimination. Conversely, in many, the matching variables were not included as covariates within the risk models and this may have resulted in underestimation of discrimination (32). The risk models also varied substantially in relation to size, selection of cases and controls, and variables considered for inclusion. This heterogeneity meant it was not possible to assess whether, for example, the number of SNPs affected the performance of the models. Furthermore, most risk models were developed and/or tested in either European, Chinese, or Japanese populations. The risk models in this review may therefore not be applicable to other population groups.
There was also heterogeneity in how the SNPs and phenotypic factors were selected and combined into risk scores, which ultimately impacts their performance in independent samples. For several models SNP selection was based on small sample sizes and/or there was limited detail on how lifestyle/hormonal risk factors were selected. Similarly, several models did not include well-established risk factors for colorectal cancer. Almost all, however, assumed that the associations of the SNPs are independent from each other and that risk follows an additive model on the log-risk scale. These assumptions are generally considered to be robust (33) and many of the authors describe how they had sought to remove SNPs in linkage disequilibrium or associated with factors on the genetic pathway. In the absence of evidence of interactions, the models also assume that the strengths of associations for each SNP with colorectal cancer are constant with age. This may not be true and further studies are needed to assess for possible interactions.
Finally, in relation to the performance measures for the models, discrimination for many had only been assessed in the development population, no data on discrimination have been published for the genetic model with the largest number of SNPs (34), only four models reported data on calibration, and only two included estimates of net reclassification. As illustrated by the higher AUROCs seen in development populations when compared with the performance of the same models from bootstrapping or cross-validation, the performance of all prediction models is overestimated because of overfitting when both model development and performance assessment use the same dataset, particularly in studies with small sample sizes (35). In addition, while the AUROC or other measures of discrimination are important when considering how well individuals can be ranked in terms of predicted risk, without measures of calibration or reclassification it is not possible to assess how closely the estimated risks match the observed risks, how much including different factors in the risk scores influences the classification of individuals, or whether the models stratify correctly into high/low categories of absolute risk that are of clinical importance.
Implications for future research
This review shows that a large number of risk scores incorporating common genetic markers have been developed to estimate future risk of colorectal cancer and suggests that many of these are better at discriminating between those at higher and lower risk of colorectal cancer than age alone, family history alone, or risk scores incorporating only phenotypic risk factors. As has been described previously (9, 36), risk models such as these could be used to stratify the general population into risk categories, based either on estimates of absolute risk for those models including age or relative risk for those excluding age, to allow screening and preventive strategies to be targeted at those most likely to benefit. While the findings of this review therefore suggest that future risk prediction in colorectal cancer will improve with the inclusion of polygenic risk factors, it remains uncertain how these models would perform in real-life settings and whether the increase in discriminatory performance and wider range of ages at which individuals would become eligible for screening that could be achieved through the inclusion of genetic variables translate into improved health of the population or the cost effectiveness of a screening program.
First, many of these models have not been externally validated and very few have had calibration assessed. As described above, these steps are essential before risk models can be incorporated into practice. To enable direct comparisons between the models, ideally the models identified in this review with the greatest number of SNPs and those with the highest reported discrimination would be assessed in a single independent cohort. However, the predictive ability of risk models is known to vary between populations and the risk of developing colorectal cancer varies substantially worldwide (37). The choice of models for independent validation will therefore depend on the population of interest and these analyses should be performed in populations similar to those in which use of the model is being considered. This is particularly important in the context of genetic risk models. Comparisons between the population genetics of different ethnic groups have shown that the estimated associated risks and population frequencies of SNPs can vary substantially with ethnicity (38, 39) and the overall magnitude of association of polygenic risk scores derived from GWAS in European-ancestry populations, as is the case for most models for colorectal cancer, may differ when applied to other populations (40). As highlighted by De La Vega and Bustamante, to avoid further inequities in health outcomes, the inclusion of diverse populations in colorectal cancer research, unbiased genotyping, and methods of bias reduction in genetic risk scores are critical (41).
Second, further methodologic studies are required to improve genome-wide risk prediction and to understand the potential benefits of including increasing numbers of SNPs, together with other rare moderate/high risk genetic variants and established or new lifestyle/environmental risk factors, as has been done for other cancers (42). These studies should also explore more sophisticated statistical methods for developing polygenic risk scores (43), and novel methods such as machine learning approaches for combining the effects of diverse risk factors (40). Third, there was substantial variation in the reporting of the studies in this review. Encouraging the use of reporting guidelines, such as the Genetic Risk Prediction Studies (GRIPS) statement (44, 45), which includes a checklist of 25 items, would improve the transparency, quality, and completeness of the reporting of new models and facilitate future syntheses in this field.
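For reference, the baseline against which more sophisticated methods for developing polygenic risk scores are usually compared is a simple weighted sum of risk-allele counts. The sketch below assumes independent SNPs with multiplicative (log-additive) per-allele effects; the SNP identifiers and odds ratios are hypothetical placeholders, not values from any model in this review.

```python
import math
from typing import Mapping


def polygenic_risk_score(allele_counts: Mapping[str, int],
                         odds_ratios: Mapping[str, float]) -> float:
    """Sum of risk-allele counts (0, 1, or 2 per SNP), each weighted by the
    natural log of that SNP's per-allele odds ratio."""
    score = 0.0
    for snp, odds_ratio in odds_ratios.items():
        count = allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {snp} must be 0, 1, or 2")
        score += count * math.log(odds_ratio)
    return score


# Hypothetical per-allele odds ratios (illustrative only).
odds_ratios = {"snp_a": 1.10, "snp_b": 1.20, "snp_c": 1.05}

# Hypothetical individual: number of risk alleles carried at each SNP.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

prs = polygenic_risk_score(person, odds_ratios)

# Under the stated assumptions, exp(prs) is the odds ratio relative to
# a person carrying no risk alleles at any of these SNPs.
print(f"Score: {prs:.3f}, relative odds: {math.exp(prs):.3f}")
```

Methods such as those cited above differ mainly in how the weights are estimated and in how correlation between SNPs is handled, rather than in this final summation step.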
Finally, the assessment of model performance is only one component when considering whether risk models are ready for clinical use; the context in which the model will be used, including the costs of measuring additional risk factors and the balance of risks and benefits of any interventions offered, and the wider ethical, legal, and social issues around implementation must also be considered. To our knowledge, only one study has modeled the potential impact of colorectal cancer screening based on age and SNPs on preventing deaths from colorectal cancer (11). Using age-specific crude rate of deaths due to colorectal cancer in a hypothetical population based on the Australian population in 2011 and assuming a 100% attendance rate at screening, that study showed that the net effect of inviting individuals for biennial FOBT based on their genetic risk would be 0.4% more colorectal cancer–related deaths and 0.2% more years of life lost per person invited to screen than inviting those ages between 50 and 74 years, against a background of 4.9% fewer screens, resulting in a 3.1% overall improved efficiency. The risk model used in that study was the model by Jenkins and colleagues (2016) that includes 45 SNPs and had an AUROC of 0.63 in a simulated population (46). It is likely, therefore, that similar improvements in efficiency would be seen with other models, many of which have reported AUROCs of greater than 0.63. However, that study did not consider the costs of implementing stratified screening, competing risks of death, or the psychologic harms associated with screening; it assumed uniform attendance across risk groups; and it included no data on the calibration of the model. Further modeling studies are therefore needed to assess the cost effectiveness and differences in quality-adjusted life years, and implementation studies are needed to assess risk-appropriate screening participation and the psychosocial consequences of this approach.
By identifying the published risk models for colorectal cancer that include common genetic variants and demonstrating the potential public health benefits of using such models to determine the starting age for screening, this study provides valuable evidence to support investment in this further research.
Disclosure of Potential Conflicts of Interest
J.D. Emery reports receiving a commercial research grant from Genetype. No potential conflicts of interest were disclosed by the other authors.
Disclaimer
All researchers were independent of the funding body, and the study sponsors and funder had no role in study design; data collection, analysis and interpretation of data; the writing of the report; or decision to submit the article for publication.
Acknowledgments
The authors thank Isla Kuhn for her help developing the search strategy, Zhirong Yang for help with translation, Richard Miller for helpful comments on the initial analysis, and our patient and public representative, Margaret Johnson, for her valuable contributions. This work was funded by a grant from Bowel Cancer UK (18PG0008). J.A. Usher-Smith is funded by a Cancer Research UK Prevention Fellowship (C55650/A21464). The University of Cambridge has received salary support in respect of S.J. Griffin from the NHS in the East of England through the Clinical Academic Reserve. A.C. Antoniou is supported by Cancer Research UK (C12292/A20861). J.D. Emery is supported by an NHMRC Practitioner Fellowship.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.