Bioinformatics Seminars

Bioinformatics Seminar

Time: 11AM
Venue: Davis Auditorium and Slido

12 April 2022

Revealing structural variants in transcriptome sequencing data

Nadia Davidson
WEHI Bioinformatics/Blood Cells and Blood Cancer

Structural rearrangements of the genome can disrupt or modify gene function and have been implicated as the causal event in disease. In cancer, they serve as important diagnostic markers and targets for therapy, and they reveal the biology that drives the disease. Although structural variants can be identified with whole-genome sequencing (WGS), their impact may be seen in RNA as altered transcript sequences. Fusion genes, which are estimated to drive around one in six cancers, are now routinely identified through transcriptome sequencing. Although numerous computational methods have been developed for fusion finding with RNA-seq, methods to identify other types of transcribed structural variants, and methods for new long-read transcriptome sequencing, are lacking. In this presentation I will describe two recently published contributions that fill this gap: MINTIE and JAFFAL. MINTIE is a pipeline to identify aberrant transcripts from next-generation RNA-seq data. It uses a reference-free approach, combining de novo assembly of transcripts with differential expression analysis to find up-regulated novel variants in a case sample. Compared with eight other approaches, it was the only method to detect a broad range of variants, including large tandem duplications, fusions to intergenic regions and novel splicing. Applied to pediatric leukemia and rare disease cohorts, it detected novel variants that were likely to drive disease. JAFFAL is a fusion finder designed for long-read transcriptome sequencing from Nanopore or PacBio. I will describe how JAFFAL overcomes the noise in long-read sequencing to achieve higher accuracy than the alternative long-read method. We show that, per base sequenced, fusions can be found in long reads with sensitivity similar to that of short reads. However, long reads provide resolution beyond what is offered by short reads. To demonstrate this, we applied JAFFAL to long-read single-cell data from cell lines and identified a complex multi-isoform fusion spanning three genes in individual cells.
