Guidelines for Choosing File Storage and I/O

HOME, RESEARCH and SCRATCH are networked file systems: each node must go through the network switch to access them. /tmp and /mnt/local are local to each node and reside on that node's hard drive. This local space is not affected by the network and is larger than the RAMDISK /dev/shm, which is located in the node's RAM. /dev/shm is, however, the closest storage location for files. Note that files stored there take up part of the node's memory.
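To compare the node-local spaces described above on a particular node, a quick check like the following may help. It is only a minimal sketch: the function name `report_local_space` is made up for illustration, and it assumes the node's `df` accepts the common `-h` (human-readable) option.

```shell
# Print size and free space for each listed location that exists on this node.
# report_local_space is a hypothetical helper, not an HPCC command.
report_local_space() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            df -h "$dir"
        else
            echo "$dir: not present on this node"
        fi
    done
}

# The three node-local locations named in the text above.
report_local_space /tmp /mnt/local /dev/shm
```

Because these locations are local, the numbers describe only the node where the command runs. Run it inside a job to see the node your job actually uses.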

The table below provides detailed information about each type of storage on HPCC ($USER is your login username and GROUP is your research group name). Please use it to choose the file system that is best for your job. The two leftmost columns are the locations that are automatically backed up by the system. The three rightmost columns are the locations that are automatically purged by the system. The column in between is a location that users who request it often treat the same as HOME or RESEARCH space, but it is NOT automatically backed up!

| | HOME | RESEARCH | nodr portion of HOME/RESEARCH | SCRATCH | LOCAL | RAMDISK |
|---|---|---|---|---|---|---|
| Primary use | Private files or data storage for each user | Shared files or data storage for group users | Same as the standard HOME/RESEARCH | Temporary large files or data storage for users and groups | Temporary small files or data usage for job running | Same as LOCAL with very fast I/O |
| Access location | Automatic at login: $HOME or /mnt/home/$USER | /mnt/research/GROUP | /mnt/ufs18/nodr | $SCRATCH or /mnt/scratch/$USER | /mnt/local or /tmp (at each node); $TMPDIR (used in a job as /tmp/local/$SLURM_JOBID) | /dev/shm (at each node) |
| Size | 50GB up to 1TB, 1 million files ($125/year for each additional TB) | Up to 1TB and 1 million files ($125/year for each additional TB) | As a portion of HOME or RESEARCH by user's request. No limit on the number of files | 50TB and 1 million files | ~400GB for intel14, ~170GB for intel16, ~400GB for intel18 | ½ of RAM |
| Command to check quota | quota | quota | quota | quota | #SBATCH --tmp=20gb to reserve 20gb in $TMPDIR | ½ of the memory requested by job |
| Backup | Yes | Yes | No | No | No | No |
| Purge policy | No | No | No | Yes (files not accessed or modified for more than 45 days may be removed) | Yes (at completion of job) | Yes (RAM may be reused by other jobs) |
| I/O best practice | Low I/O using single or multiple nodes | Same as HOME | Same as HOME or RESEARCH | Heavy I/O on files of large size using single or multiple nodes | Frequent I/O operations on many files in one node | Frequent and fast I/O operations on small files in one node |
| Careful with | Watch for quota. Avoid heavy parallel I/O. | Same as HOME. In addition, need to set umask or file permission so files can be shared in group. | Be aware of no automatic backup. May need to do backup manually by user. | Avoid frequent I/O on many small files (< 1MB), such as untarring a tar file to create many small files in a short time. Move files to HOME or RESEARCH before purge period elapses. | Need to copy or move files to HOME or RESEARCH before job completes. Only local access available. Users are not able to store files in one node and gain I/O access to them from other nodes. | Same as LOCAL. Request extra memory in your job script so you'll have enough space for file storage. |
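The LOCAL guidance (reserve space in $TMPDIR, do small-file I/O on the node, copy results back before the job completes) could be combined in a job script along the following lines. This is a sketch under stated assumptions, not an official HPCC template. The helper name `run_on_local_disk`, the `input`/`output` directory layout, and the paths in the usage comment are all hypothetical. The fallback to /tmp when $TMPDIR is unset is there only so the function can be tried outside a job.

```shell
#!/bin/bash
#SBATCH --tmp=20gb   # reserve 20gb in $TMPDIR, as listed in the table

# run_on_local_disk INPUT_TAR RESULT_DIR COMMAND [ARGS...]   (hypothetical helper)
# Unpacks INPUT_TAR into node-local space, runs COMMAND from an empty
# "output" directory next to the unpacked "input" directory, then copies
# the outputs to RESULT_DIR before the local space is purged.
run_on_local_disk() {
    local input_tar result_dir work_dir status
    input_tar=$1
    result_dir=$2
    shift 2

    # $TMPDIR is set inside a job; /tmp is an assumed fallback for testing.
    work_dir=$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX") || return 1
    mkdir "$work_dir/input" "$work_dir/output"

    # Untar on local disk so the many small files are not created on SCRATCH.
    tar -xf "$input_tar" -C "$work_dir/input" || { rm -rf "$work_dir"; return 1; }

    # Run the command in a subshell so the caller's directory is unchanged.
    (cd "$work_dir/output" && "$@")
    status=$?

    # Copy outputs back (for example to HOME or RESEARCH), then clean up.
    mkdir -p "$result_dir" && cp -r "$work_dir/output/." "$result_dir/"
    rm -rf "$work_dir"
    return "$status"
}

# Hypothetical usage inside the job script:
# run_on_local_disk "$SCRATCH/inputs.tar" "$HOME/results" my_program ../input
```

Only the `output` directory is copied back, so the unpacked inputs do not count against the HOME or RESEARCH quota. The same pattern could target /dev/shm instead of $TMPDIR, in which case the job should request extra memory as the RAMDISK column advises.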