Subjunc: detecting exon-exon junctions and mapping RNA-seq reads

The Subjunc aligner is an RNA-seq read aligner specialized in detecting exon-exon junctions and producing full alignments of reads, exon-spanning reads in particular.

For gene expression analysis, the Subread aligner is recommended for mapping RNA-seq reads, although the Subjunc aligner can be used too. Subread is recommended because it is much faster than Subjunc and gene expression analysis does not require reads to be fully mapped. For other purposes, the Subjunc aligner should be used.

Download and installation

The Subjunc aligner is part of the Subread package. Please refer to the instructions there for the download and installation.

A quick start

Build an index for the reference genome (instead of one file per chromosome, you may provide a single FASTA file containing all the reference sequences):
subread-buildindex -o my_index chr1.fa chr2.fa ...
Report uniquely mapped reads only (the default behaviour). The mapping output includes a BAM file and the exon-exon junctions discovered from the data:
subjunc -T 5 -i my_index -r reads1.txt -o subjunc_results.bam
Report up to three alignments for each multi-mapping read:
subjunc --multiMapping -B 3 -T 5 -i my_index -r reads1.txt -o subjunc_results.bam
Detect indels of up to 16 bp:
subjunc -I 16 -i my_index -r reads1.txt -o subjunc_results.bam
Map paired-end reads and discover exon-exon junctions:
subjunc -d 50 -D 600 -i my_index -r reads1.txt -R reads2.txt -o subjunc_results.bam
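The commands above can be chained into one small script. The following is a minimal sketch, not an official wrapper: it uses only the options shown in this quick start, and the file names, the `run` helper and the `DRY_RUN` variable are hypothetical names introduced for illustration. By default it only prints the commands, so it can be checked without the Subread package installed.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the quick-start workflow.
# By default the commands are printed, not executed.
# Set DRY_RUN=0 to run them (assumes the Subread package is on PATH).
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Hypothetical file names, taken from the quick-start examples.
INDEX="my_index"
READS1="reads1.txt"
READS2="reads2.txt"
OUT="subjunc_results.bam"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: build the index from the reference FASTA files.
run subread-buildindex -o "$INDEX" chr1.fa chr2.fa

# Step 2: map paired-end reads with 5 threads and a 50-600 bp fragment
# length range, and discover exon-exon junctions.
run subjunc -T 5 -d 50 -D 600 -i "$INDEX" -r "$READS1" -R "$READS2" -o "$OUT"
```

Running the script with `DRY_RUN=0` executes the same two commands in order; because of `set -e`, the mapping step is skipped if the index build fails.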

Citation

[1] Liao Y, Smyth GK and Shi W (2013). The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Research, 41(10):e108

Users Guide

The Users Guide (0.5MB) contains a comprehensive description of this program.

Get help

The best way to get help is to send your questions to the SEQanswers mailing list or the Bioconductor mailing list. Alternatively, you may directly contact Dr. Wei Shi (shi at wehi dot edu dot au).

Links

Subread: a general-purpose read aligner.

featureCounts: Summarizing reads to genomic features.

Rsubread: a Bioconductor R implementation of the Subread package.

A case study for analyzing RNA-seq data: using the Bioconductor packages Rsubread and limma to perform a complete analysis of RNA-seq data.

Subread package overview: a brief description of the Subread package.