Bioinformatics Seminar
Time: 11AM
Venue: Davis Auditorium and Online
11 March 2025
High-accuracy RNA integrity definition for unbiased transcriptome comparisons with INDEGRA
Nikolay Shirokikh, Australian National University
RNA carries immense amounts of information about the state of cells. However, variability in RNA integrity introduces bias across samples, so the fundamental task of accurately assessing per-transcript differential abundance of RNA remains a major challenge. Existing methods to assess RNA degradation either rely on amplification (TIN) or do not provide a balanced transcriptome-wide measure (RIN, DV200). Direct RNA Sequencing (DRS) data provide a unique opportunity to investigate RNA molecules in their native configuration. Using DRS, we confirm that isoform-level differential gene expression can be adversely influenced by the extent of RNA degradation, leading to false discoveries. To resolve these RNA quantification problems, we develop a universal transcriptome-wide RNA integrity measure, the Direct Transcriptome Integrity number (DTI). DTI considers each covered transcript and is based on accurate mathematical modelling of RNA decay. DTI highlights inter- and intra-transcript degradation variability and isolates RNA degradation from mapping uncertainties. Using DTI in an integrated software pipeline for integrity and degradation of RNA assessment (INDEGRA, https://github.com/Arnaroo/INDEGRA), we identify and correct false discoveries across differently degraded samples. INDEGRA plugs directly into popular differential expression tools such as DESeq2, limma-voom and edgeR, and can be used to study any number of samples and conditions. Furthermore, we use Bayesian modelling to separate technical RNA degradation from biological RNA decay, enabling direct transcriptome-wide comparisons of in vivo/in situ RNA stability in almost any setting at steady state. INDEGRA facilitates unbiased RNA quantification, greatly streamlines comparisons of transcriptomes derived from different sources, and completes a toolset of convenient and accurate multi-omic RNA methods.