Bioinformatics Seminars

Time: 11AM
Venue: Hybrid

28 March 2023

Fast and accurate integration of atlas-level single cell data

Ramyar Molania
WEHI Bioinformatics

Atlas-level single cell datasets with millions of cells are valuable for many basic research and translational applications. For example, query cells can be mapped onto them to rapidly identify relevant cell states and phenotypes. These datasets include samples profiled across laboratories, platforms, and conditions, which can introduce unwanted variation into the data. Accurate interpretation of these data requires reliable normalization and integration. Numerous statistical and machine learning methods have been designed for single cell data integration; however, they all exhibit some limitations. To address these limitations, we have developed a fast version of the RUV-III method for atlas-level single cell data integration. In this talk, I will explain how we have improved the computation of the RUV-III method and then demonstrate its performance on a variety of large single cell datasets, including RNA-seq, CITE-seq, ATAC-seq, NanoString CosMx SMI, MERSCOPE and Xenium data. We have benchmarked the RUV-III method against many single cell integration methods, such as Harmony, Seurat and scVI, assessing how well each method removes unwanted variation and preserves biological variation in the data. We have used the fast RUV-III method to create several cancer and non-cancer atlas-level single cell datasets, including a COVID-19 scRNA-seq dataset with 2.5 million cells. I will also show that the RUV-III method improves the integration of multi-modal single cell data.
