Amin Al Olama and colleagues investigated the predictive ability of a polygenic risk score and observed that the risk for men in the top 1% of the distribution was 30.6-fold that of men in the bottom 1% and 4.2-fold the median risk (1). The authors conclude that “genetic risk profiling using SNPs could be useful in defining men at high risk for the disease for targeted prevention and screening programs.” Yet such a conclusion warrants a formal assessment of calibration and discriminative ability.
First, assessment of calibration is essential because the reported risks were not based on empirical observations but calculated from a risk model that was built assuming multiplicative effects between the SNPs. The authors verified whether allelic effects within each SNP could be considered as multiplicative, but not whether multiplicativity between SNPs could be assumed. Multiplicative models are known to under- and overestimate risks at the extremes of the risk distribution, especially when they include a large number of SNPs (2, 3). Although the authors mentioned that “the predicted ORs for the top 1% and the bottom 1% of the population, based on a log-linear model, did not differ from that observed,” this needs to be evidenced by a formal calibration analysis of the entire risk distribution and of the extremes if these are of special interest.
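To illustrate what such a calibration analysis involves, the following is a minimal sketch in Python, not the analysis performed by either group. It ranks subjects by predicted risk, divides them into equal-sized groups, and compares the mean predicted risk with the observed proportion of cases in each group. The function name `calibration_table`, its parameters, and the toy data are all assumptions introduced here for illustration.

```python
def calibration_table(predicted, observed, n_groups=10):
    """Compare mean predicted risk with the observed event proportion
    within groups of subjects ranked by predicted risk.

    predicted: predicted risks (probabilities between 0 and 1)
    observed:  outcomes coded 1 (developed disease) or 0 (did not)
    Returns one (group size, mean predicted, observed proportion) tuple
    per non-empty group, from lowest to highest predicted risk.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    ranked = sorted(zip(predicted, observed))  # ascending predicted risk
    n = len(ranked)
    rows = []
    for g in range(n_groups):
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue  # fewer subjects than groups
        mean_predicted = sum(p for p, _ in group) / len(group)
        observed_rate = sum(y for _, y in group) / len(group)
        rows.append((len(group), mean_predicted, observed_rate))
    return rows


# Toy data, invented for illustration only (not from any study).
toy_predicted = [0.1] * 10 + [0.9] * 10
toy_observed = [1] + [0] * 9 + [1] * 9 + [0]
for size, mean_predicted, observed_rate in calibration_table(
        toy_predicted, toy_observed, n_groups=2):
    print(f"n={size}  predicted={mean_predicted:.2f}  observed={observed_rate:.2f}")
```

With `n_groups=100` and at least 100 subjects, the first and last rows correspond to the bottom and top 1% of the predicted risk distribution, which is where a multiplicative model would be expected to deviate most if it is miscalibrated.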
Second, the discriminative ability of the model should be assessed by examining how well the predicted risks distinguish between men who did and did not develop prostate cancer, quantified by the area under the receiver operating characteristic curve (AUC), so that its performance can be compared with that of other models. Using the SNP data reported in their Table 2 and applying a validated simulation algorithm (4), we estimated that the AUC of the polygenic risk score would be 0.64. If confirmed by their data, this AUC would be lower than that of other models, including the prediction model from the Prostate Cancer Prevention Trial, whose AUC was 0.66 for any prostate cancer and 0.71 for clinically significant prostate cancer (5).
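For readers less familiar with the measure, the AUC equals the probability that a randomly chosen man who developed prostate cancer has a higher predicted risk than a randomly chosen man who did not, with ties counted as one half. The sketch below computes it directly from that definition. It is an illustration only and is not the simulation algorithm of reference 4; the function name `auc` and the toy numbers are assumptions.

```python
def auc(scores, status):
    """Area under the ROC curve from its pairwise definition.

    scores: predicted risks or risk scores (higher means higher risk)
    status: outcomes coded 1 (case) or 0 (non-case)
    """
    cases = [s for s, y in zip(scores, status) if y == 1]
    controls = [s for s, y in zip(scores, status) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above non-case
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Toy scores, invented for illustration only.
print(auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))  # 3 of 4 pairs ordered correctly -> 0.75
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, so values such as 0.64 or 0.66 indicate that the risk distributions of cases and non-cases overlap substantially.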
Finally, predictive performance is generally highest in the population in which the prediction model was developed, because the coefficients of the model are fitted to those data. The researchers have enough data to split their sample in two and perform both the development and validation analyses in one study. Independent validation of both calibration and discrimination will likely lead to a more modest view of the predictive performance of the polygenic risk score.
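A split-sample analysis of this kind can be sketched as follows. The example is a simplified illustration under stated assumptions, not a reconstruction of any published analysis: the data are simulated under a log-additive model, the ten allele frequencies and per-allele log odds ratios are hypothetical, and the per-SNP effects are estimated from allele counts as a simple stand-in for a fitted regression model. All function names are introduced here for illustration.

```python
import math
import random


def simulate(n, freqs, log_ors, intercept, rng):
    """Simulate risk-allele counts (0/1/2) and case status under a
    log-additive model, i.e. multiplicative effects on the odds scale."""
    genotypes, status = [], []
    for _ in range(n):
        g = [(rng.random() < f) + (rng.random() < f) for f in freqs]
        log_odds = intercept + sum(b * x for b, x in zip(log_ors, g))
        risk = 1.0 / (1.0 + math.exp(-log_odds))
        genotypes.append(g)
        status.append(1 if rng.random() < risk else 0)
    return genotypes, status


def estimate_log_ors(genotypes, status):
    """Per-allele log odds ratio for each SNP from allele counts in cases
    and non-cases (0.5 is added to every cell to avoid zero counts)."""
    n_case = sum(status)
    n_ctrl = len(status) - n_case
    estimates = []
    for j in range(len(genotypes[0])):
        case_risk = sum(g[j] for g, y in zip(genotypes, status) if y == 1)
        ctrl_risk = sum(g[j] for g, y in zip(genotypes, status) if y == 0)
        case_other = 2 * n_case - case_risk
        ctrl_other = 2 * n_ctrl - ctrl_risk
        estimates.append(math.log(
            (case_risk + 0.5) * (ctrl_other + 0.5)
            / ((case_other + 0.5) * (ctrl_risk + 0.5))))
    return estimates


def polygenic_scores(genotypes, log_ors):
    """Weighted sum of risk-allele counts for each subject."""
    return [sum(b * x for b, x in zip(log_ors, g)) for g in genotypes]


def auc(scores, status):
    """Probability that a case outscores a non-case (ties count half)."""
    cases = [s for s, y in zip(scores, status) if y == 1]
    controls = [s for s, y in zip(scores, status) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def split_sample_demo(n=2000, seed=0):
    """Fit weights in one half, then evaluate them in both halves."""
    rng = random.Random(seed)
    freqs = [0.3] * 10      # hypothetical risk-allele frequencies
    log_ors = [0.15] * 10   # hypothetical true per-allele log odds ratios
    genotypes, status = simulate(n, freqs, log_ors, -1.5, rng)
    # Simulated subjects are already in random order; real data should be
    # split at random before this step.
    half = n // 2
    dev_g, dev_y = genotypes[:half], status[:half]
    val_g, val_y = genotypes[half:], status[half:]
    fitted = estimate_log_ors(dev_g, dev_y)  # development half only
    apparent = auc(polygenic_scores(dev_g, fitted), dev_y)
    validated = auc(polygenic_scores(val_g, fitted), val_y)
    return apparent, validated


apparent, validated = split_sample_demo()
print(f"apparent AUC (development half): {apparent:.3f}")
print(f"validated AUC (validation half): {validated:.3f}")
```

The key design choice is that the weights are estimated in the development half only and then applied unchanged to the validation half. In any single split the two AUCs can differ in either direction by chance, but on average the apparent value is the more optimistic one, and the gap tends to widen as more coefficients are fitted relative to the sample size.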
See the Response, p. 223
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant Support
This work was supported by a consolidator grant from the European Research Council (GENOMICMEDICINE).