Bioinformatics Seminars


Time: 11AM
Venue: Zoom Webinar

23 February 2021

Building a better deep mutational scanning predictor by incorporating low-throughput mutagenesis data

James Fu
WEHI Bioinformatics

With the help of high-throughput sequencing techniques, missense mutations can be readily identified by whole-genome sequencing. However, our knowledge of how such variants affect the function of the protein, and of the organism as a whole, remains limited. Experimental methods such as deep mutational scanning (DMS) have proven useful for evaluating the impact of variants, and multiple studies have focused on building predictors of DMS data. Protein property features have previously been used to generate such computational DMS data; in this study, we additionally incorporate alanine scanning data to build Computational DMS using Alanine scanning (CDMA). The model is built with a random forest algorithm and tuned with Bayesian optimization. We show that CDMA improves the accuracy of predicted DMS data by leveraging known alanine scanning scores, and that the improvement is balanced across distinct mutation types. However, we also find that experimental data with lower compatibility may decrease prediction accuracy, highlighting the need for a more complex data selection process.
