Mouse and Human versions of the MSigDB in R Format
An updated version of this repository is available at http://bioinf.wehi.edu.au/MSigDB.
Background
The Molecular Signatures Database (MSigDB) is an important resource created and maintained by the Broad Institute. Its gene sets come from a wide variety of sources and relate to a variety of species, mostly human. Our work at the WEHI predominantly uses mouse models of human disease. To facilitate use of the MSigDB in our work, we have created a pure mouse version of the MSigDB by mapping all sets to mouse orthologs. A pure human version is also provided.
Procedure
- The MSigDB v5.2 XML file (the current release at the time) was downloaded.
- Human Entrez Gene IDs were mapped to Mouse Entrez Gene IDs, using the HGNC Comparison of Orthology Predictions (HCOP) (downloaded 11 October 2016).
- Each collection was converted to a list in R and written to an RData file using save().
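The mapping and saving steps above can be sketched in base R as follows. This is only an illustrative sketch, not the script used to build the files: the gene set names, the Entrez Gene IDs, the ortholog table and its column names (human_entrez, mouse_entrez), and the output file name are all made-up placeholders, and the handling of one-to-many orthologs (keep every mouse ortholog) is an assumption.

```r
# Toy human gene sets: a named list of character vectors of Entrez Gene IDs.
# All IDs here are made-up placeholders, not real genes.
human_sets <- list(
  SET_A = c("1001", "1002", "1003"),
  SET_B = c("1003", "1004")
)

# Hypothetical ortholog table standing in for an HCOP download.
# Human gene 1003 has two mouse orthologs; human gene 1004 has none.
orthologs <- data.frame(
  human_entrez = c("1001", "1002", "1003", "1003"),
  mouse_entrez = c("2001", "2002", "2003", "2004"),
  stringsAsFactors = FALSE
)

# Map one gene set: keep every mouse ortholog of every human gene in the set
# (an assumption), dropping duplicates and genes without an ortholog.
map_to_mouse <- function(ids, table) {
  unique(table$mouse_entrez[table$human_entrez %in% ids])
}

mouse_sets <- lapply(human_sets, map_to_mouse, table = orthologs)

# Discard any set left empty after mapping.
mouse_sets <- mouse_sets[lengths(mouse_sets) > 0]

# Write the list to an RData file, one object per file.
out_file <- file.path(tempdir(), "toy_mouse_sets.rdata")
save(mouse_sets, file = out_file)
```

Keeping the IDs as character vectors rather than numbers avoids accidental arithmetic or reformatting of identifiers.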
The links below point to files containing the gene sets in R format. Each file can be loaded into R using the load() function and contains a single data object: a list of character vectors of Entrez Gene IDs, one vector per gene set.
Gene Sets for Human:
- H hallmark gene sets (rdata file)
- C1 positional gene sets (rdata file)
- C2 curated gene sets (rdata file)
- C3 motif gene sets (rdata file)
- C4 computational gene sets (rdata file)
- C5 GO gene sets (rdata file)
- C6 oncogenic signatures (rdata file)
- C7 immunologic signatures (rdata file)
Gene Sets for Mouse:
- H hallmark gene sets (rdata file)
- C2 curated gene sets (rdata file)
- C3 motif gene sets (rdata file)
- C4 computational gene sets (rdata file)
- C5 GO gene sets (rdata file)
- C6 oncogenic signatures (rdata file)
- C7 immunologic signatures (rdata file)
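Loading one of these files and inspecting its contents might look like the sketch below. So that it runs without a download, it first writes a small stand-in file; the file name, object name, and IDs are hypothetical placeholders, and for a real file you would set rdata_path to the location of the downloaded file instead.

```r
# Build a small stand-in file (placeholder names and IDs, not real data).
toy_sets <- list(
  SET_A = c("1001", "1002"),
  SET_B = c("1002", "1003", "1004")
)
rdata_path <- file.path(tempdir(), "toy_gene_sets.rdata")
save(toy_sets, file = rdata_path)
rm(toy_sets)

# load() restores objects under their saved names and returns those names.
# Loading into a fresh environment avoids overwriting objects in the
# workspace and lets us retrieve the object without knowing its name.
env <- new.env()
obj_name <- load(rdata_path, envir = env)
gene_sets <- get(obj_name, envir = env)  # the file holds a single object

# Inspect the collection: number of sets, their names, and their sizes.
length(gene_sets)
head(names(gene_sets))
set_sizes <- lengths(gene_sets)
```

Capturing the return value of load() is useful here because the name of the object stored in each file is not stated on this page.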
Comments/Questions? Email Alexandra Garnham
Last Modified: 26 June 2020