
File transfer

This document highlights several simple methods to transfer files to the HPCC home and research directories. There are two main gateway systems for copying files. 

  1. hpcc.msu.edu: This is our login gateway. While it can be used for file transfer, it's not intended for high volumes of files. More importantly, the scratch space is not mounted there and so you can't access your files on scratch.

  2. rsync.hpcc.msu.edu: It has access to scratch and is dedicated to file transfer. Although this gateway is named after the popular Linux "rsync" command, it can be used for "sftp" or "scp" as well. As of October 2022, the rsync gateway accepts SSH keys as the ONLY authentication method; username/password login does not work. Please refer to the SSH key tutorial for setting up your key pair.

All operating systems

Note

The OnDemand portal is best for transferring files smaller than ~1 GB. For transferring larger files to and from the HPCC, see Large file transfer (Globus).

The most straightforward way to transfer files to and from the HPCC is via our OnDemand web portal. Log in with your NetID at https://ondemand.hpcc.msu.edu and click "Files" to access your different user spaces.

A screenshot of the Open OnDemand file explorer. On the top navigation bar, the Files dropdown menu shows the user's home, research, and scratch spaces.

On this page, you can upload, download, rename, and modify most files in storage locations you have access to.

Specific operating systems

Use the tabs below to view the relevant options for your system.

Several command-line utilities are available to macOS and Linux users. Each has its own advantages.

Warning

Using the command line to connect to the rsync.hpcc.msu.edu gateway requires SSH key setup. Please refer to the SSH key tutorial for setting up your key pair.

  1. Basic file copy (scp)

    A simple command for transferring files between the cluster and another host is scp. To copy a file from a local directory to file space on the cluster, run a command such as

    scp example.txt username@rsync.hpcc.msu.edu:example_copy.txt
    

    This copies the file named example.txt from the local host's current directory to the user's home directory on the cluster, naming the copy example_copy.txt. Leaving the space after the colon blank gives the new file the same name as the original. Note: To transfer a file whose name contains spaces, put a backslash before each space in the remote file name, e.g. scp "My File Name" username@hpcc.msu.edu:"My\ File\ Name".

    To copy a file from the cluster to your local directory,

    scp username@rsync.hpcc.msu.edu:example.txt ./example_copy.txt
    

    will copy the file named example.txt from the user's home directory on the cluster to the current directory on the local host, naming the new file example_copy.txt. Leaving the space after the slash blank gives the new file the same name as the original. The -r option can be used to copy entire directories recursively.
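As a minimal sketch of the -r option, the snippet below assembles a recursive scp command and prints it for review instead of running it. The directory name results and the placeholder username are assumptions for illustration only; substitute your own values.

```shell
#!/bin/sh
# Hypothetical values for illustration; replace with your own.
hpcc_user="username"
gateway="rsync.hpcc.msu.edu"
local_dir="results"   # assumed local directory to upload

# -r copies the directory and everything beneath it.
# Leaving the part after the colon empty keeps the original name
# and places the copy in the remote home directory.
scp_cmd="scp -r ${local_dir} ${hpcc_user}@${gateway}:"

# Print the command for review; nothing is transferred by this script.
echo "${scp_cmd}"
```

Once the printed command looks right, it can be pasted into a terminal to perform the transfer.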

  2. Synchronize directories (rsync)

    If you are an advanced Linux/Mac user, rsync is a useful utility that makes mirroring directories simple. The syntax looks very similar to scp.

    • To mirror <local_dir> on my local computer to <hpcc_dir> on hpcc, the following command can be run:

      rsync -ave ssh <local_dir> username@rsync.hpcc.msu.edu:<hpcc_dir>
      

      In the above command, rsync scans both directories and uploads only the files in <local_dir> that are new or have changed; by default, a file counts as changed when its size or modification time differs from the copy in <hpcc_dir>. (Options are available to alter this behavior, e.g. -u skips files that are newer on the destination.)

    • To mirror the HPCC directory to your local system, call

      rsync -ave ssh username@rsync.hpcc.msu.edu:<hpcc_dir> <local_dir>
      
    • Please use the rsync command with the option --chmod=Dg+s to transfer files from a local computer to your research space.
      See the following example:

      rsync -ave ssh TestDir --chmod=Dg+s <username>@rsync.hpcc.msu.edu:/mnt/research/<GroupName>/
      

    Note

    The first time you use rsync, you might want to add the -n flag to do a dry run before any files are copied.
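To illustrate what a dry run does, here is a small sketch that works entirely on the local machine. The two temporary directories stand in for <local_dir> and <hpcc_dir> and are assumptions made so the example can run without a connection to the gateway; the rsync step is skipped if rsync is not installed.

```shell
#!/bin/sh
# Local demonstration only: both directories are temporary stand-ins.
work_dir="$(mktemp -d)" || exit 1
mkdir -p "${work_dir}/local_dir" "${work_dir}/hpcc_dir"
echo "sample data" > "${work_dir}/local_dir/example.txt"

if command -v rsync >/dev/null 2>&1; then
    # -n (--dry-run): report what would be copied, but copy nothing.
    rsync -avn "${work_dir}/local_dir/" "${work_dir}/hpcc_dir/"
fi

# Count the entries in the destination after the dry run.
copied_count="$(ls -A "${work_dir}/hpcc_dir" | wc -l | tr -d ' ')"
echo "Files copied by the dry run: ${copied_count}"

# Remove the temporary directories.
rm -rf "${work_dir}"
```

The dry run lists example.txt as a file it would send, yet the destination stays empty. For a real transfer, replace the destination with username@rsync.hpcc.msu.edu:<hpcc_dir>, add -e ssh as in the commands above, and drop -n once the listing looks right.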

  3. Interactive file copy (sftp)

    When performing several data transfers between hosts, the sftp command may be preferable, as it allows the user to work interactively. Running

    sftp username@rsync.hpcc.msu.edu
    

    from a local host establishes a connection between that host and the cluster. Both hosts can be navigated. For the local file system, lcd changes to the specified directory, lpwd prints the working directory, and lls prints a list of files in the current directory. For the remote file system, the same three commands are available, minus the leading l. Also available are commands to change permissions, rename files, and manipulate directories on the remote host. The two key commands are get <file>, which copies the file in the remote working directory to the local working directory, and put <file>, which copies the file in the local working directory to the remote working directory. The quit command closes the connection between hosts.
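The same commands can also be collected in a file and passed to sftp with its -b (batch) option, which the text above does not cover. The sketch below writes such a file and prints it; the directory and file names (data, project, input.txt, output.txt) are hypothetical, and the script itself does not contact the gateway.

```shell
#!/bin/sh
# Hypothetical file and directory names; replace with your own.
batch_file="$(mktemp)" || exit 1

# One sftp command per line, in the order they would be typed interactively.
printf '%s\n' \
    'lcd data' \
    'cd project' \
    'put input.txt' \
    'get output.txt' \
    'quit' > "${batch_file}"

# Show the batch file for review.
cat "${batch_file}"

# To run the batch against the gateway (requires SSH key setup):
#   sftp -b "${batch_file}" username@rsync.hpcc.msu.edu
```

Batch mode only works with non-interactive authentication, which matches the SSH key requirement of the rsync gateway.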

  4. Copy files from Internet (wget)

    wget is a simple command useful for copying files from the Internet to a user's file space on the cluster.  Running the line

    wget http://www.examplesite.com/examplefile.txt
    

    downloads examplefile.txt to the user's working directory.

An alternative method for transferring files on Windows is MobaXterm. Installation and setup instructions are available here. Once you are connected to the HPCC, you can use the MobaXterm SCP interface tab, available on the left side of the window, to upload and download files.

A screenshot of the MobaXterm interface, including the scp file transfer window on the left

You can type a path at the top of the SCP interface tab to access any location on the HPCC where you have permissions, e.g. your scratch and research spaces.

MobaXterm can also be set up as an SFTP client to transfer data through the rsync.hpcc.msu.edu gateway. As with other uses of the rsync.hpcc.msu.edu gateway, you must have an SSH key pair. Please refer to the SSH key tutorial for setting up your key pair.

The configuration for the SFTP session should look like this:

A screenshot of the MobaXterm SFTP session settings window. All settings are set as default except for the ones listed in text below.

The settings should be:

  • Remote host: rsync.hpcc.msu.edu
  • Username: your HPCC username.
  • Select the Advanced Sftp settings tab.
  • Use private key: check the box. Click the small file icon to open a file browser and select your private key. Alternatively, type in the path to your private key.
  • All other settings can remain default.