Abstract
Major finding: A computational framework termed ExPecto may enable in silico prediction of disease risk from DNA sequence.
Concept: ExPecto facilitates tissue-specific prediction of the effects of rare variants on gene expression.
Impact: ExPecto may allow large-scale prediction of disease risk from rare variants to aid precision medicine.
Understanding the effects of genomic alterations is essential for precision medicine. However, the vast range of genomic variations makes it impractical to experimentally determine the effects on disease at the desired scale. Further, currently available predictive models based on matched expression and genotypic data are limited to frequently observed mutations and specific cell or tissue types. To address these challenges, Zhou and colleagues developed ExPecto, a quantitative model that predicts the expression level of genes from sequence information. This approach used a deep convolutional neural network trained to predict 2,002 different histone mark, transcription factor, and DNA accessibility profiles from more than 200 cell and tissue types, facilitating tissue-specific prediction of the epigenomic effects of genomic variants. The ExPecto framework allows prediction of the effects of genomic variation on gene expression from sequence data alone, without training on epigenomic or genomic variant data. The gene expression patterns predicted by ExPecto were highly correlated with transcriptomic data generated by RNA sequencing across tissue types. ExPecto prioritized putative disease-associated variants identified in genome-wide association studies, and the results of 4 top hits in immune-related diseases were experimentally validated. Comparison of ExPecto predictions with data from the Human Gene Mutation Database suggested that ExPecto may be used for large-scale prediction of disease risk. The majority of predicted disease mutations were associated with strong decreases in gene expression, although mutations resulting in overexpression of TERT were also identified, consistent with previously reported TERT promoter mutations in cancer. 
These findings indicate that ExPecto may facilitate in silico prediction of the effects of cancer-associated mutations, including rare variants, on gene expression, providing a means to predict the clinical relevance of mutations in a tissue-specific manner at a scale that is not currently feasible experimentally.
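The general idea of scoring a variant from sequence alone can be illustrated with a minimal sketch: predict expression for the reference sequence, predict it again with the alternate allele substituted in, and take the difference. This is not the ExPecto implementation. The function `toy_expression_model`, the random weights, and the 10-base sequence are hypothetical stand-ins for a trained sequence-to-expression model and a real regulatory region.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len, 4) one-hot matrix; unknown bases stay all-zero."""
    mat = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            mat[i, j] = 1.0
    return mat

def toy_expression_model(seq, weights):
    """Hypothetical stand-in for a trained model (assumption, not ExPecto):
    predicted log expression is a weighted sum of the one-hot features."""
    return float(np.sum(one_hot(seq) * weights))

def variant_effect(ref_seq, pos, alt_base, weights):
    """Predicted change in log expression when ref_seq[pos] is replaced by alt_base."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return toy_expression_model(alt_seq, weights) - toy_expression_model(ref_seq, weights)

# Illustrative inputs only: random weights and a made-up 10-base sequence.
rng = np.random.default_rng(0)
ref = "ACGTTGCAAC"
w = rng.normal(size=(len(ref), 4))

# Score a T>G substitution at 0-based position 3.
effect = variant_effect(ref, 3, "G", w)
print(f"predicted change in log expression: {effect:+.3f}")
```

Because the score depends only on the model and the two sequences, the same procedure applies to variants never observed in a population, which is what makes rare-variant prediction possible in principle.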
Note: Research Watch is written by Cancer Discovery editorial staff. Readers are encouraged to consult the original articles for full details. For more Research Watch, visit Cancer Discovery online at http://cancerdiscovery.aacrjournals.org/CDNews.