Amazon Simple Storage Service (Amazon S3) is a cloud object storage service that offers industry-leading scalability, data availability, security, and performance. Amazon S3 can be used to store and retrieve any amount of data at any time, from anywhere.
To use Amazon S3, you’ll need an AWS account, an access key ID, a secret access key, and the AWS Command Line Interface (AWS CLI) installed.
👉 Some data sets hosted on Amazon S3 (e.g., common reference genomes) can be accessed without an AWS account.
Example 1 (no authentication): Download common reference genomes from AWS
2. Copy Bucket URL and Sync command
> module load EBModules
> module load awscli
> aws s3 --no-sign-request --region xxxx sync s3://xxxx destination_folder
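Before transferring a large data set, it can help to preview what the sync would do. The sketch below wraps the command above in a small shell function and passes any extra arguments, such as --dryrun, through to aws s3 sync. The function name public_sync and the AWS_BIN variable are assumptions made for illustration and are not part of the AWS CLI; the xxxx values in the usage comments are the same placeholders used above.

```shell
# Hypothetical helper: unauthenticated sync from a public bucket.
# Usage: public_sync REGION BUCKET_URL DESTINATION [extra aws s3 sync options]
public_sync() {
  if [ "$#" -lt 3 ]; then
    echo "usage: public_sync REGION BUCKET_URL DESTINATION [sync options]" >&2
    return 2
  fi
  region="$1"
  bucket_url="$2"
  dest="$3"
  shift 3
  # AWS_BIN is an illustrative override (not an AWS CLI setting);
  # it defaults to the real aws executable.
  "${AWS_BIN:-aws}" s3 --no-sign-request --region "$region" sync "$bucket_url" "$dest" "$@"
}

# Example calls (substitute the region and bucket URL copied in step 2):
#   public_sync xxxx s3://xxxx destination_folder --dryrun   # preview only, nothing is copied
#   public_sync xxxx s3://xxxx destination_folder            # real transfer
```

Running with --dryrun first lists the operations that would be performed without downloading anything, which is a cheap way to confirm the bucket URL and region are correct.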
Example 2 (authentication): Download data from a collaborator/personal S3 bucket
👉 This assumes you already have an AWS account.
If you are not already in possession of access keys (access key ID and secret access key), create them using the IAM console. Follow the instructions here.
> aws configure
AWS Access Key ID [None]: xxxxxxxxx
AWS Secret Access Key [None]: xxxxxxx
Default region name [None]:
Default output format [None]:
This will create two files in your home directory, ~/.aws/credentials and ~/.aws/config, and a default profile:
> aws configure list-profiles
default
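If you want to maintain multiple profiles, you can edit the credentials and config files and rename default to a new value. In the credentials file the section header is the bare profile name, [name]; in the config file a named profile's header takes the form [profile name].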
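As a rough illustration of what the two files look like with a named profile, the sketch below writes example copies into a scratch directory rather than ~/.aws, so your real configuration is untouched. The profile name test-profile, the region us-east-1, and the json output format are assumed values chosen for the example; the xxxxxxxx keys are placeholders. AWS_SHARED_CREDENTIALS_FILE and AWS_CONFIG_FILE are the environment variables the AWS CLI reads to locate these files.

```shell
# Write example profile files to a scratch directory (not ~/.aws),
# so nothing in your real configuration is modified.
demo_dir="$(mktemp -d)"

# credentials: the section header is the bare profile name.
cat > "$demo_dir/credentials" <<'EOF'
[test-profile]
aws_access_key_id = xxxxxxxx
aws_secret_access_key = xxxxxxxx
EOF

# config: non-default profiles use a "profile " prefix in the header.
# The region and output values here are assumptions for illustration.
cat > "$demo_dir/config" <<'EOF'
[profile test-profile]
region = us-east-1
output = json
EOF

# Point the AWS CLI at the scratch files for this shell session only.
export AWS_SHARED_CREDENTIALS_FILE="$demo_dir/credentials"
export AWS_CONFIG_FILE="$demo_dir/config"
```

With these variables set, aws configure list-profiles should report test-profile; unset both variables to return to the files under ~/.aws.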
Alternatively, you can use aws configure set:
> aws configure set aws_access_key_id xxxxxxxx --profile test-profile
> aws configure set aws_secret_access_key xxxxxxxx --profile test-profile
> aws configure list-profiles
test-profile
> aws s3 ls s3://xxxxx/ --recursive --summarize --human-readable --profile test-profile # list bucket contents
> aws s3 cp s3://xxxxx/filename . --profile test-profile # copy file
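If you run many commands against the same bucket, repeating --profile on each one gets tedious. A minimal sketch, assuming the test-profile created above: the AWS CLI reads the AWS_PROFILE environment variable, so exporting it once selects the profile for the rest of the shell session. The active_profile function is a hypothetical helper added here only to show which profile is in effect; it is not an AWS CLI command.

```shell
# Select the profile once for this shell session instead of repeating --profile.
export AWS_PROFILE="test-profile"

# Hypothetical helper: print the profile that later aws commands would use.
# Falls back to "default" when AWS_PROFILE is unset or empty.
active_profile() {
  printf '%s\n' "${AWS_PROFILE:-default}"
}

active_profile

# Commands in this session now use test-profile without the flag, e.g.:
#   aws s3 cp s3://xxxxx/filename .
```

An explicit --profile on the command line still takes precedence over AWS_PROFILE, so a one-off command can use a different profile without changing the session setting.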
Robert Petkus, 8/1/23