Cold Spring Harbor Laboratory | Information Technology

SRA-Toolkit and fasterq-dump on the HPC cluster

Creation date: 9/1/2023 4:57 PM | Updated: 9/1/2023 4:57 PM | Tags: elzar, hpc, software

The SRA-Toolkit is a set of tools and libraries for accessing data in the Sequence Read Archive (SRA) format.

fastq-dump and fasterq-dump are both used to extract data in FASTQ or FASTA format from SRA accessions.

fasterq-dump is the successor to fastq-dump. It is faster (surprise): it uses temporary files and multi-threading to speed up extraction.

For best performance, use prefetch to download the accession before running fasterq-dump. The prefetch tool downloads all necessary files and can resume interrupted downloads.
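
The two-step workflow can be sketched as a small shell script. This is a minimal illustration, not a site-specific recipe: it assumes prefetch and fasterq-dump are already on your PATH, and the accession SRR000001, the thread count, the directory names, and the DRY_RUN switch are all placeholders chosen for the example. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: fetch an SRA accession with prefetch, then extract FASTQ with fasterq-dump.
# ACCESSION, THREADS, OUTDIR, SCRATCH and DRY_RUN are illustrative names, not site settings.

ACCESSION="${ACCESSION:-SRR000001}"   # placeholder accession; substitute your own
THREADS="${THREADS:-4}"               # number of threads for fasterq-dump
OUTDIR="${OUTDIR:-fastq}"             # directory for the extracted FASTQ files
SCRATCH="${SCRATCH:-sra_tmp}"         # directory for fasterq-dump's temporary files
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the commands, 0 = run them

# Print a command, and execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

sra_to_fastq() {
    # Make sure the output and temporary directories exist.
    run mkdir -p "$OUTDIR" "$SCRATCH" || return 1
    # Step 1: download the accession; rerunning continues an interrupted download.
    run prefetch "$ACCESSION" || return 1
    # Step 2: extract the reads using several threads and a dedicated temp directory.
    run fasterq-dump "$ACCESSION" --threads "$THREADS" --temp "$SCRATCH" --outdir "$OUTDIR"
}

sra_to_fastq
```

To execute the commands instead of printing them, set DRY_RUN=0 (and your own ACCESSION) in the environment before running the script. Run fasterq-dump from the same working directory as prefetch so that it can find the downloaded accession. Because fasterq-dump writes sizable temporary files, pointing --temp at fast scratch storage is usually worthwhile.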