Bioinformatics Seminars

Bioinformatics Seminar

Time:
Venue: Na

2 July 2019


Impact of gene annotation choice on the quantification of RNA-seq data

David Chisanga
WEHI Bioinformatics

RNA sequencing is currently the method of choice for genome-wide profiling of gene expression. A popular approach to quantifying gene expression levels from RNA-seq data is to map reads to a reference genome and then count the reads mapped to each gene. Gene annotation data, which include the chromosomal coordinates of exons for tens of thousands of genes, are required for this quantification process. For the human and mouse genomes, there are several major sources of gene annotation that can be used for quantification, such as the Ensembl, GENCODE, UCSC, and RefSeq databases. However, there is very little understanding of the effect that the choice of annotation has on the quantification of gene expression in an RNA-seq pipeline. In this talk, I will present results from our comparison of the Ensembl and RefSeq human annotations and their impact on gene expression quantification, using benchmark RNA-seq data generated by the SEquencing Quality Control (SEQC/MAQC III) consortium. We found that the use of RefSeq gene annotation led to better quantification accuracy, based on the correlation with ground truth such as expression data from >800 real-time PCR validated genes.

