Mouse and Human versions of the MSigDB in R Format

An updated version of this repository is available at


The Molecular Signatures Database (MSigDB) is an important resource created and maintained by the Broad Institute. The gene sets in the MSigDB come from a wide variety of sources and relate to a variety of species, mostly human. Our work at the WEHI predominantly uses mouse models of human disease. To facilitate use of the MSigDB in our work, we have created a pure mouse version of the MSigDB by mapping all sets to mouse orthologs. A pure human version is also provided.


  1. The current MSigDB v5.2 XML file was downloaded.
  2. Human Entrez Gene IDs were mapped to Mouse Entrez Gene IDs, using the HGNC Comparison of Orthology Predictions (HCOP) (downloaded 11 October 2016).
  3. Each collection was converted to a list in R and written to an RData file using save().
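
The mapping and saving steps can be sketched roughly as follows. This is a minimal illustration and not the script used to build the files: the gene sets, the ortholog table and its column names (human_entrez, mouse_entrez), the helper map_to_mouse() and the output file name are all hypothetical, and the real HCOP download has its own layout. The sketch keeps every mouse ortholog found for a set and drops genes without one; how such cases were handled in the actual files is not stated here.

```r
# Toy human gene sets: a named list of character vectors of Entrez Gene IDs.
# All IDs in this example are illustrative values only.
human_sets <- list(
  SET_A = c("7157", "672", "1956"),
  SET_B = c("672", "999999")  # "999999" stands in for a gene with no mouse ortholog
)

# Hypothetical ortholog table with one row per human/mouse pair.
# These column names are assumptions, not the HCOP column names.
orthologs <- data.frame(
  human_entrez = c("7157", "672", "1956"),
  mouse_entrez = c("22059", "12189", "13649"),
  stringsAsFactors = FALSE
)

# Map one gene set to mouse IDs: keep rows whose human ID is in the set,
# then return the unique, non-missing mouse IDs.
map_to_mouse <- function(ids, table) {
  mapped <- table$mouse_entrez[table$human_entrez %in% ids]
  unique(mapped[!is.na(mapped)])
}

# Apply the mapping to every set in the collection.
mouse_sets <- lapply(human_sets, map_to_mouse, table = orthologs)

# Write the collection to an RData file as a single object.
out_file <- file.path(tempdir(), "mouse_example_sets.rdata")
save(mouse_sets, file = out_file)
```

Storing one list per file means that load() later restores a single, named object for each collection.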

The links below lead to files containing the gene sets in R format. The files can be loaded into R using the load() function. Each file contains one data object, which is a list of character vectors of Entrez Gene IDs, one vector for each gene set.
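
As a rough sketch of how such a file might be loaded and inspected: the example below first writes a small stand-in file so that it runs without a download. The file name, the object name example_sets and the IDs are hypothetical; substitute the path of a downloaded file. Because load() returns the names of the objects it restores, the code does not need to know the object's name in advance.

```r
# Stand-in for a downloaded file (hypothetical name and contents).
example_sets <- list(SET_A = c("22059", "12189"), SET_B = c("13649"))
rdata_file <- file.path(tempdir(), "example_gene_sets.rdata")
save(example_sets, file = rdata_file)

# Load into a separate environment; load() returns the restored object names.
env <- new.env()
object_name <- load(rdata_file, envir = env)
gene_sets <- get(object_name, envir = env)

# Inspect the structure: a named list of character vectors of Entrez Gene IDs.
str(gene_sets[1])

# Number of genes in each set.
set_sizes <- lengths(gene_sets)

# Names of the sets that contain a given Entrez Gene ID.
sets_with_gene <- names(gene_sets)[
  vapply(gene_sets, function(ids) "12189" %in% ids, logical(1))
]
```

Loading into a separate environment avoids overwriting an existing object of the same name in the workspace.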

Gene Sets for Human:

Gene Sets for Mouse:

Comments/Questions? Email Alexandra Garnham

Last Modified: 26 June 2020