The SRA-Toolkit is a set of tools and libraries for accessing data in the Sequence Read Archive (SRA) format.
`fastq-dump` and `fasterq-dump` both extract data in FASTQ or FASTA format from SRA accessions. `fasterq-dump` is the successor to `fastq-dump`. It is faster (surprise) and uses temporary files and multi-threading to speed up extraction.
For best performance, it is recommended to use `prefetch` to download the accession before using `fasterq-dump`. The `prefetch` tool downloads all necessary files and can resume interrupted downloads.
Use on CSHL HPC
From either bamdev1 or bamdev2:
Quick test:
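The original command for this step was not preserved on this page. As a stand-in, the sketch below follows the smoke test commonly shown in the NCBI SRA Toolkit documentation: stream a couple of spots from a small public run to the terminal. The function name `sra_quick_test` is hypothetical, and the default accession `SRR390728` is an assumption; any small public accession should work.

```shell
# Hypothetical helper: print the first few spots of an accession to stdout.
# $1 = accession (assumed default: SRR390728), $2 = number of spots (default: 2)
sra_quick_test() {
    # -X caps the number of spots read; --stdout prints instead of writing files
    fastq-dump --stdout -X "${2:-2}" "${1:-SRR390728}"
}
```

Running `sra_quick_test` with no arguments should print a few FASTQ records if the toolkit is installed and can reach NCBI.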
Setup before use
1. Create an empty directory (any name will do) to use as your local repository.
2. Follow the instructions here
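The linked instructions are not reproduced on this page. As a rough sketch of the two steps, assuming the repository lives at a path of your choosing (the `$HOME/sra-repo` default and the function name `setup_sra_repo` are hypothetical), you can create the directory and then open the toolkit's interactive configuration editor to point the local repository at it:

```shell
# Hypothetical helper: create the repository directory, then open vdb-config.
# $1 = repository directory (assumed default: $HOME/sra-repo)
setup_sra_repo() {
    repo_dir="${1:-$HOME/sra-repo}"
    mkdir -p "$repo_dir" || return 1
    echo "Set the local repository location to: $repo_dir"
    # Opens the interactive editor; the location is set by hand inside it
    vdb-config --interactive
}
```

The function only prepares the directory and launches the editor; the repository path itself still has to be entered in `vdb-config`.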
Download data
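The original commands for this section were not preserved. A minimal sketch of the recommended order, `prefetch` first and `fasterq-dump` second, might look like the following. The function name `fetch_accession`, the `fastq` output directory, and the thread count of 4 are assumptions for illustration, not site defaults.

```shell
# Hypothetical helper: download one accession, then convert it to FASTQ.
# $1 = accession, $2 = output directory (default: fastq), $3 = threads (default: 4)
fetch_accession() {
    accession="$1"
    outdir="${2:-fastq}"
    threads="${3:-4}"
    mkdir -p "$outdir" || return 1
    # prefetch can be re-run to resume an interrupted download
    prefetch "$accession" || return 1
    # fasterq-dump converts the prefetched accession using several threads
    fasterq-dump "$accession" --outdir "$outdir" --threads "$threads"
}
```

For example, `fetch_accession SRR000001 ./fastq 8` would download the placeholder accession `SRR000001` and write FASTQ files to `./fastq` using 8 threads.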
Tips
👉 Some of the data sets are large; make sure you have enough disk space
👉 After changing the local repository, be sure to re-run `vdb-config`
👉 Use `vdb-validate` to test downloaded data for integrity
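The space and integrity checks above can be combined into a small helper. This is only a sketch: the function name `check_accession` is hypothetical, and it assumes the accession was already downloaded with `prefetch`.

```shell
# Hypothetical helper: report free disk space, then validate a download.
# $1 = accession (or path to it), $2 = working directory (default: current)
check_accession() {
    accession="$1"
    workdir="${2:-.}"
    # Show free space on the filesystem that holds the working directory
    df -h "$workdir" || return 1
    # Check the downloaded accession for integrity
    vdb-validate "$accession"
}
```

A non-zero exit status from `check_accession` means either the directory could not be inspected or validation failed.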
Robert Petkus, 9/1/23