Subread: a superfast and accurate read aligner

The Subread aligner is a general-purpose read aligner, which can be used to map reads generated from both genomic DNA sequencing and RNA sequencing technologies.

When mapping RNA-seq reads, Subread should be used only for gene expression analysis. For other purposes (e.g., detecting genomic variations) that require full alignments of the reads, use the Subjunc aligner instead.

Download and installation

The Subread aligner is part of the Subread package. Please refer to the instructions there for the download and installation.

A quick start

Build an index for the reference genome (you may provide a single FASTA file including all the reference sequences):
subread-buildindex -o my_index chr1.fa chr2.fa ...
Map single-end reads using 5 threads:
subread-align -T 5 -i my_index -r reads.txt -o subread_results.sam
Detect indels of up to 16 bp:
subread-align -I 16 -i my_index -r reads.txt -o subread_results.sam
Report up to three best mapping locations:
subread-align -B 3 -i my_index -r reads.txt -o subread_results.sam
Report uniquely mapped reads only:
subread-align -u -i my_index -r reads.txt -o subread_results.sam
Map paired-end reads:
subread-align -d 50 -D 600 -i my_index -r reads1.txt -R reads2.txt -o subread_results_PE.sam
Detect fusions in genomic DNA sequencing data:
subread-align --reportFusions -d 50 -D 600 -i my_index -r reads1.txt -R reads2.txt -o subread_results.sam
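
The quick-start commands each show one option at a time. As a minimal sketch, the script below assembles a single paired-end command that uses several of them together. It uses only flags that appear above. The variable names, the pe_command helper and the DRY_RUN switch are hypothetical and introduced for illustration, and the script assumes that -T, -I, -d and -D can be given in the same call.

```shell
#!/bin/sh
# Sketch: paired-end mapping that combines options from the quick start.
# Assumptions (not from the original page): the variable names, the
# pe_command helper, the DRY_RUN switch, and that -T, -I, -d and -D
# may be given together.

INDEX=my_index              # index base name produced by subread-buildindex
READS1=reads1.txt           # first-mate reads
READS2=reads2.txt           # second-mate reads
OUT=subread_results_PE.sam  # SAM output file
THREADS=5                   # value passed to -T
MAX_INDEL=16                # value passed to -I (indel length in bp)

# Print the full subread-align command line for the settings above.
pe_command() {
    echo "subread-align -T $THREADS -I $MAX_INDEL -d 50 -D 600" \
         "-i $INDEX -r $READS1 -R $READS2 -o $OUT"
}

# By default only show the command; set DRY_RUN=0 to execute it.
if [ "${DRY_RUN:-1}" = "0" ]; then
    eval "$(pe_command)"
else
    pe_command
fi
```

Running the script as is prints the assembled command without executing it, so the settings can be checked before starting a long mapping job. Running it with DRY_RUN=0 executes the command and requires subread-align on the PATH and an existing index.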

Citation

[1] Liao Y, Smyth GK and Shi W (2013). The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Research, 41(10):e108

Users Guide

The Users Guide (0.5MB) contains a comprehensive description of this program.

Get help

The best way to get help is to send your questions to the SEQanswers mailing list or the Bioconductor mailing list. Alternatively, you may directly contact Dr. Wei Shi (shi at wehi dot edu dot au).

Scientific publications citing Subread

See the full list from Google Scholar.

Links

Subjunc: Detecting exon-exon junctions and mapping RNA-seq reads.

featureCounts: Summarizing reads to genomic features.

Rsubread: a Bioconductor R implementation of the Subread package.

A case study for analyzing RNA-seq data: Using the Bioconductor packages Rsubread and limma to perform a complete analysis of RNA-seq data.

Subread package overview: A brief description of the Subread package.