
Guidelines for Choosing File Storage and I/O

HOME, RESEARCH, and SCRATCH are networked file systems: each node must go through the network switch to access these spaces. The LOCAL storage options at /tmp and /mnt/local reside on each node's own hard drive and are unaffected by network load. All of these are larger than RAMDISK (/dev/shm), which lives inside the node's RAM. RAMDISK is, however, the closest (and therefore fastest) storage location for files. Files stored there take up part of the node's memory and are counted when Slurm calculates the memory a job is using.
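As a concrete illustration of the RAMDISK workflow (the paths and file names below are hypothetical, not from the HPCC docs), a job can stage its working files in /dev/shm and clean up afterwards so the memory is released:

```shell
#!/bin/bash
# Sketch with illustrative paths: stage work in RAMDISK, copy results off, clean up.
# Everything written under /dev/shm counts against the job's memory allocation.
set -e
WORKDIR=$(mktemp -d /dev/shm/job.XXXXXX 2>/dev/null || mktemp -d)  # fall back if /dev/shm is unavailable
echo "fast scratch data" > "$WORKDIR/output.dat"   # stand-in for an I/O-heavy step
DEST=$(mktemp -d)                                  # stand-in for a HOME or RESEARCH destination
cp "$WORKDIR/output.dat" "$DEST/"                  # move results off RAMDISK
rm -rf "$WORKDIR"                                  # free the node's memory before the job ends
```

Deleting the RAMDISK directory at the end matters: until it is removed, those files keep counting against the job's memory.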

The table below provides detailed information about each type of storage on the HPCC ($USER is your login username and GROUP is your research group name); use it to choose the file system best suited to your job. HOME and RESEARCH are the only systems with automatic offsite disaster recovery (DR) protection. SCRATCH, LOCAL, and RAMDISK all have automatic purge policies.

The "nodr portion of HOME/RESEARCH" entry is similar to the "HOME" and "RESEARCH" entries, but refers to the portion of those directories that has been requested to move to nodr space. Note that this space is NOT protected by automatic DR! See the Home space or Research space pages for more information.

HOME
- Primary use: private file or data storage for each user.
- Access location: automatic login directory, $HOME or /mnt/home/$USER.
- Size: 50GB, expandable up to 1TB; 1 million files ($125/year for each additional TB).
- I/O best practice: low I/O, from single or multiple nodes.
- Be careful with: watch your quota; avoid heavy parallel I/O.
- Command to check quota: quota
- Disaster recovery: yes.
- Purge policy: no.

RESEARCH
- Primary use: shared file or data storage for a research group.
- Access location: /mnt/research/GROUP.
- Size: 50GB, expandable up to 1TB; 1 million files ($125/year for each additional TB).
- I/O best practice: same as HOME.
- Be careful with: same as HOME; in addition, set umask or file permissions so files can be shared within the group.
- Command to check quota: quota
- Disaster recovery: yes.
- Purge policy: no.

nodr portion of HOME/RESEARCH
- Primary use: same as the standard HOME/RESEARCH.
- Access location: /mnt/ufs18/nodr.
- Size: a portion of HOME or RESEARCH, moved at the user's request; no limit on the number of files.
- I/O best practice: same as HOME or RESEARCH.
- Be careful with: no automatic DR protection.
- Command to check quota: quota
- Disaster recovery: no.
- Purge policy: no.

SCRATCH
- Primary use: temporary storage of large files or data for users and groups.
- Access location: $SCRATCH or /mnt/scratch/$USER.
- Size: 50TB and 1 million files.
- I/O best practice: heavy I/O on large files, from single or multiple nodes.
- Be careful with: avoid frequent I/O on many small files (< 1MB), such as untarring a tar file that creates many small files in a short time; move files to HOME or RESEARCH before the purge period elapses.
- Command to check quota: quota
- Disaster recovery: no.
- Purge policy: yes; files not accessed or modified for more than 45 days may be removed.

LOCAL
- Primary use: temporary storage of small files or data while a job runs.
- Access location: /mnt/local or /tmp on each node; within a job, $TMPDIR (set to /tmp/local/$SLURM_JOBID).
- Size: ~400GB on intel14, ~170GB on intel16, ~400GB on intel18. Note: each userID is restricted from consuming more than 95% of the total available space in /tmp.
- I/O best practice: frequent I/O operations on many files, within one node.
- Be careful with: copy or move files to HOME or RESEARCH before the job completes; only local access is available, so files stored on one node cannot be accessed from other nodes.
- Space reservation: add #SBATCH --tmp=20gb to reserve 20gb in $TMPDIR.
- Disaster recovery: no.
- Purge policy: yes; purged at completion of the job.

RAMDISK
- Primary use: same as LOCAL, with very fast I/O.
- Access location: /dev/shm on each node.
- Size: ½ of the node's RAM.
- I/O best practice: frequent, fast I/O operations on small files, within one node.
- Be careful with: same as LOCAL; request extra memory in your job script so you'll have enough space for file storage.
- Space limit: ½ of the memory requested by the job.
- Disaster recovery: no.
- Purge policy: yes; the RAM may be reused by other jobs.
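Putting the LOCAL guidance together, a job script might look like the following sketch. The #SBATCH --tmp directive comes from the table above; the other directives, file names, and workload are illustrative assumptions:

```shell
#!/bin/bash
# Illustrative job script: do many-small-file I/O on node-local storage,
# then copy results to networked storage before the job ends (LOCAL is purged at completion).
#SBATCH --time=01:00:00
#SBATCH --mem=4gb
#SBATCH --tmp=20gb                                # reserve 20gb of local disk for $TMPDIR

set -e
WORK=$(mktemp -d "${TMPDIR:-/tmp}/demo.XXXXXX")   # inside a job, $TMPDIR is /tmp/local/$SLURM_JOBID
cd "$WORK"
for i in 1 2 3; do                                # stand-in for a workload creating many small files
    echo "part $i" > "part_$i.txt"
done
cat part_*.txt > combined.txt
DEST=$(mktemp -d)                                 # stand-in for /mnt/home/$USER or /mnt/research/GROUP
cp combined.txt "$DEST/"                          # results must leave LOCAL before the job completes
```

Because LOCAL is purged when the job completes, the final copy to networked storage is the step that actually preserves the results.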