shRNA-seq and CRISPR-Cas9 genetic screen analysis using edgeR

This webpage provides code and data that demonstrate how to use the edgeR package to perform a differential representation analysis of shRNA-seq and sgRNA-seq screen data, as described in Dai et al. (2014) edgeR: a versatile tool for the analysis of shRNA-seq and CRISPR-Cas9 genetic screens, F1000Research, 3:95. Please cite this article if you use the pipeline it describes in your research.
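As a rough illustration of what a differential representation analysis looks like in edgeR, the sketch below runs a two-group comparison on simulated counts. It is not the code from the case study: the guide names, sample names, group labels, filtering threshold and simulated depletion are all assumptions made for the example, and the case study itself may use different dispersion estimation or testing steps.

```r
library(edgeR)

# Simulated screen (hypothetical): 200 guides, 2 samples per time point
set.seed(42)
n_guides <- 200
counts <- matrix(rnbinom(n_guides * 4, mu = 200, size = 10), nrow = n_guides,
                 dimnames = list(paste0("guide", seq_len(n_guides)),
                                 c("day0_A", "day0_B", "day14_A", "day14_B")))
# Deplete the first 10 guides at the later time point
counts[1:10, 3:4] <- rnbinom(20, mu = 20, size = 10)

group <- factor(c("day0", "day0", "day14", "day14"))
y <- DGEList(counts = counts, group = group)

# Keep guides with more than 0.5 counts per million in at least 2 samples
# (an illustrative threshold, not a recommendation)
keep <- rowSums(cpm(y) > 0.5) >= 2
y <- y[keep, , keep.lib.sizes = FALSE]

# Fit a negative binomial GLM and test day14 against day0
design <- model.matrix(~ group)
y <- estimateDisp(y, design)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = 2)

# Top-ranked guides by evidence for differential representation
top <- topTags(lrt, n = 10)
top
```

Negative log fold changes in the resulting table correspond to guides that are depleted at the later time point.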

Note I: The processAmplicons function referred to in version 2 of our paper replaces the processHairpinReads function from version 1. Provided Note II below is satisfied, please use processAmplicons to obtain summarised counts from your fastq files.

Note II: The processAmplicons function assumes the sequences in your fastq files have a fixed structure (as per Figure 1A of our paper). It cannot be used if your shRNAs/guide RNAs/sample indexes are in random locations within each read. If this is the case, you will need to write your own sequence processing pipeline to produce a matrix of counts, and can then begin your edgeR analysis from that matrix, as in the Zuber and Shalem examples below.
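The two starting points described in the notes above can be sketched as follows. The file names reads.fastq, barcodes.txt and hairpins.txt are hypothetical, and the barcode and hairpin files are assumed to be tab-delimited tables with at least the columns ID and Sequences. Only the readfile, barcodefile and hairpinfile arguments are used here because other processAmplicons arguments differ between edgeR versions; check ?processAmplicons for the version you have installed. When the files are not present, the sketch falls back to a small simulated count matrix.

```r
library(edgeR)

input_files <- c("reads.fastq", "barcodes.txt", "hairpins.txt")  # hypothetical names

if (all(file.exists(input_files))) {
  # Route 1: reads with a fixed structure, summarised by processAmplicons
  x <- processAmplicons("reads.fastq",
                        barcodefile = "barcodes.txt",
                        hairpinfile = "hairpins.txt")
} else {
  # Route 2: start from a matrix of counts produced by your own pipeline
  # (simulated here purely for illustration)
  set.seed(1)
  counts <- matrix(rnbinom(40, mu = 100, size = 5), nrow = 10,
                   dimnames = list(paste0("shRNA", 1:10), paste0("S", 1:4)))
  x <- DGEList(counts = counts,
               group = factor(c("ctrl", "ctrl", "trt", "trt")),
               genes = data.frame(ID = rownames(counts)))
}

# Either route yields a DGEList: rows are shRNAs/sgRNAs, columns are samples
dim(x)
```

Both routes end with the same kind of object, so the downstream analysis does not depend on how the counts were obtained.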

Case study

The vignette for this case study (pdf)

The R code used in this case study

Software

You must have R (version 3.1.0 or later) installed on your computer to run this case study.

You also need to install the edgeR (version 3.8.0 or later) and limma (version 3.19.18 or later) Bioconductor packages. Type the following commands at the R command prompt to install these packages:

source("http://bioconductor.org/biocLite.R")
biocLite(c("edgeR", "limma"))
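On recent R releases the biocLite installer is no longer available, and Bioconductor packages are installed through the BiocManager package instead. The following is a minimal sketch, assuming R 3.5.0 or later and a working internet connection; the version checks at the end simply restate the minimum versions listed above.

```r
# Assumes R >= 3.5.0, where BiocManager replaces biocLite()
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install(c("edgeR", "limma"))

# Confirm the minimum versions required by this case study
stopifnot(getRversion() >= "3.1.0",
          packageVersion("edgeR") >= "3.8.0",
          packageVersion("limma") >= "3.19.18")
```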

For information on our Galaxy tool see here.

Data

This case study looks at data from six different screens. Gzipped archives of the data from each screen can be downloaded from the following links: Screen 1 (shRNA-seq) [5.1 GB], Screen 2 (shRNA-seq) [2.3 GB], Screen 3 (shRNA-seq) [8.4 GB], Zuber Screen (shRNA-seq) [62 KB], Screen 4 (CRISPR-Cas9) [11 GB], Shalem Screen (CRISPR-Cas9) [3.3 MB].

After downloading, uncompress and save these files to your current working directory. You should then be able to run the code provided above.
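The archives can also be unpacked from within R. This is a sketch under the assumption that the downloads are gzipped tar archives ending in .tar.gz or .tgz and have been saved in the current working directory; the actual file names and archive layout are not stated here, so adjust the pattern if yours differ.

```r
# Find downloaded archives in the working directory (assumed extensions)
archives <- list.files(pattern = "\\.(tar\\.gz|tgz)$")

# Extract each archive into the current working directory
for (f in archives) {
  untar(f, exdir = ".")
}

# List the working directory to confirm the extracted files are present
list.files()
```

If no matching archives are found, the loop does nothing, so the snippet is safe to run before the downloads have finished.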

Author

Matt Ritchie (mritchie at wehi dot edu dot au)