Bioinformatics Seminars

Time: 11AM
Venue: Davis Auditorium and Teams

22 November 2022

Differential transcript expression with edgeR: dividing out mapping ambiguity with bootstrap counts

Pedro Baldoni
WEHI Bioinformatics

A major challenge in the analysis of RNA-seq data at the transcript level is the inherent variability introduced during quantification of RNA sequencing reads, which arises from the high levels of sequence similarity among transcripts. The quantification uncertainty of transcript-level counts, whose magnitude is intractable to measure analytically, introduces an extra level of technical variation that is difficult to estimate and compromises differential transcript expression (DTE) analysis with standard methods. Bootstrap counts, as provided by popular quantification tools such as Salmon and kallisto, allow us to estimate the technical quantification uncertainty and account for its effect in DTE analyses. In this talk, I will present catchSalmon and catchKallisto, two functions included in the R/Bioconductor package edgeR that estimate the extra technical overdispersion using bootstrap counts generated upstream by the quantification tools. I will discuss how the technical overdispersion can be effectively removed from the data via count scaling, which reduces the effective count size and provides users with more powerful DTE analyses within the edgeR framework. A simulation study and a DTE analysis of a real RNA-seq experiment will be presented to illustrate the benefits of accounting for quantification uncertainty via count scaling.
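As a rough illustration of the count-scaling idea described in the abstract, the Python sketch below estimates a per-transcript overdispersion from bootstrap counts and divides the observed counts by it. It is a simplified sketch under stated assumptions, not the edgeR implementation: the function names `estimate_overdispersion` and `scale_counts` are hypothetical, the estimator is a plain pooled variance-to-mean ratio floored at 1, and any moderation or other refinement the R functions may apply is left out.

```python
# Illustrative sketch only: the names and the simplified estimator are
# assumptions made for this example, not the edgeR code.

def estimate_overdispersion(bootstraps_per_sample):
    """Estimate technical overdispersion for ONE transcript.

    bootstraps_per_sample: list of lists; each inner list holds the
    bootstrap counts of this transcript in one sample.
    """
    stat = 0.0  # pooled sum of squared deviations divided by the mean
    df = 0      # pooled degrees of freedom
    for boots in bootstraps_per_sample:
        mean = sum(boots) / len(boots)
        if mean > 0:  # samples with zero mean carry no information
            stat += sum((b - mean) ** 2 for b in boots) / mean
            df += len(boots) - 1
    if df == 0:
        return 1.0  # nothing to estimate from: leave counts unscaled
    # Floor at 1 so that scaling never inflates the counts.
    return max(stat / df, 1.0)


def scale_counts(counts, overdispersion):
    """Divide the observed counts of one transcript by its overdispersion."""
    return [c / overdispersion for c in counts]


# Toy data: one transcript, two samples, five bootstrap resamples each.
boots = [[4, 16, 10, 16, 4], [20, 20, 20, 20, 20]]
disp = estimate_overdispersion(boots)
scaled = scale_counts([90, 180], disp)
```

A transcript whose bootstrap counts barely vary keeps an overdispersion of 1 and its counts are unchanged, while an ambiguously mapped transcript is scaled down, reflecting the lower effective count size.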

