Aspera bulk file transfer

The Aspera Connect command-line client (ascp) is a file transfer tool for downloading or uploading large files in bulk between the HPCC and data repository sites such as those operated by NCBI. To transfer files with Aspera, the remote host must be running an Aspera server.

This short tutorial demonstrates how to load and use the command-line version of Aspera to download files from the NCBI FTP site.

Step 1: Log in to the HPCC rsync gateway node:

ssh -XY netid@rsync.hpcc.msu.edu

Step 2: Load the Aspera-Connect 3.9.8 module:

module load Aspera-Connect/3.9.8

You can only execute Aspera file transfers from a gateway node. Transfers on the dev-nodes will not work.
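After loading the module, it can help to confirm that ascp is actually on your PATH before starting a transfer. The following is a minimal sketch of such a check. The function name require_cmd is a hypothetical helper introduced here for illustration; it is not part of Aspera or of the HPCC module system.

```shell
# Hypothetical helper: succeed only if the named command is found on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    # Report the problem on stderr and signal failure to the caller.
    echo "$1 not found on PATH; load its module first (e.g. module load Aspera-Connect/3.9.8)" >&2
    return 1
}

# Usage: report readiness only when ascp is available in this session.
if require_cmd ascp; then
    echo "ascp is ready"
fi
```

If the check fails on the gateway node, rerunning the module load command from Step 2 is the first thing to try.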

Tip

If you need a higher version than 3.9.8, you can try installing it with conda (conda install -c rpetit3 aspera-connect). Conda will handle glibc issues.

Example command for downloading data from NCBI:

ascp -T -k 1 -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh anonftp@ftp.ncbi.nlm.nih.gov:/refseq/uniprotkb ~/NCBI_data
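In the command above, -T disables encryption for higher throughput, -k 1 lets ascp resume partially transferred files, and -i points to the private key file used to authenticate as anonftp. If you download from several NCBI paths, a small wrapper can keep the command consistent. The sketch below only prints the command so it can be reviewed before running. The helper build_ascp_cmd and the variables KEY, REMOTE, and ASPERA_KEY are hypothetical names chosen for this example, and the sketch assumes paths without spaces.

```shell
# Sketch only: KEY, REMOTE, ASPERA_KEY, and build_ascp_cmd are hypothetical names.
# KEY defaults to the key path used in the example command above.
KEY="${ASPERA_KEY:-$HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh}"
REMOTE="anonftp@ftp.ncbi.nlm.nih.gov"

# Print the ascp command for a remote path ($1) and a local destination ($2).
build_ascp_cmd() {
    printf 'ascp -T -k 1 -i %s %s:%s %s\n' "$KEY" "$REMOTE" "$1" "$2"
}

# Preview the command for the example download before running it.
build_ascp_cmd /refseq/uniprotkb "$HOME/NCBI_data"
```

Once the printed command looks right, it can be pasted into the gateway node session and run as is.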

For uploading files from the HPCC, please refer to the NCBI instructions for uploading SRA files.