Bioinformatics Seminars

Time: 11AM
Venue: Hybrid

18 April 2023

Simulation of single cell data incorporating biological and batch effects

Jianan Wang
WEHI Bioinformatics

With the development of single cell technology, numerous cell samples with various biological signals have been sequenced on different platforms. Merging those cells requires an efficient integration tool, and computational simulators can help benchmark and assess the performance of integration methods. Although existing single cell simulators can simulate library size, biological effects and batch effects separately, they do not currently capture the associations among these three factors. Here we present GLMsim, the first single cell simulator to simultaneously capture library size, biology and unwanted variation, together with their associations, via a generalized linear model, and to simulate data resembling the original real data. GLMsim is a robust simulator with multiple practical applications. First, it can quantitatively benchmark different integration methods and assess their ability to retain biology and remove batch effects. Second, GLMsim enables us to check the assumptions of a model or method: it can simulate data under a variety of scenarios and check whether the method works on those datasets. Moreover, GLMsim can be used to provide guidance for differential expression analyses after integrating multiple single cell datasets. GLMsim is a powerful simulator for generating faithful synthetic data that can help compare single cell methods and help developers examine their methods.

