Users looking for example SLURM scripts for Singularity jobs should see this page.
These examples demonstrate different kinds of executables (compiled C/C++ code, Python scripts, and R scripts) for variety.
How an executable should be run and the resources it requires depend on the details of that program.
Single node, single core
This example demonstrates a simple R script job but is suited to any serial job. The Rscript command does not necessarily have to be wrapped with the srun command because this job uses no parallelism.
#!/bin/bash --login

# Job name:
#SBATCH --job-name=Rscript

# Number of tasks (processes)
# SLURM defaults to 1 but we specify anyway
#SBATCH --ntasks=1

# Memory per node
# Specify "M" or "G" for MB and GB respectively
#SBATCH --mem=20M

# Wall time
# Format: "minutes", "hours:minutes:seconds",
# "days-hours", or "days-hours:minutes"
#SBATCH --time=01:00:00

# Mail type
# e.g., which events trigger email notifications
#SBATCH --mail-type=ALL

# Mail address
#SBATCH --mail-user=yournetid@msu.edu

# Standard output and error to file
# %x: job name, %j: job ID
#SBATCH --output=%x-%j.SLURMout

echo "This script is from ICER's example SLURM scripts"

# Purge current modules and load those we require
module purge
module load GCC/8.3.0 OpenMPI/3.1.4 R/4.0.2 powertools

# Run our job
cd /mnt/home/user123
srun Rscript myscript.R

# Print resource information
scontrol show job $SLURM_JOB_ID
js -j $SLURM_JOB_ID
Single node, single core with GPU
Jobs that use GPUs must request these resources within their SLURM script. SLURM will automatically allocate the job to a node with the appropriate GPU.
Multiple GPUs may be available per node.
Note
This example requests only a single GPU node. Users looking to use multiple nodes, each with their own GPU(s), should replace the --gpus option with --gpus-per-node. See the list of SLURM specifications for more.
The hypothetical Python script in this example would use PyTorch to define and train a neural network. The job script loads our conda environment following the recommended setup in the conda usage guide.
#!/bin/bash --login

# Job name:
#SBATCH --job-name=pytorch

# Number of processes.
# Unless programmed using MPI,
# most programs using GPU-offloading only need
# a single CPU-based process to manage the device(s)
#SBATCH --ntasks=1

# Type and number of GPUs
# The type is optional.
#SBATCH --gpus=v100:4

# Total CPU memory
# All available memory per GPU is allocated by default.
# Specify "M" or "G" for MB and GB respectively
#SBATCH --mem=2G

# Wall time
# Format: "minutes", "hours:minutes:seconds",
# "days-hours", or "days-hours:minutes"
#SBATCH --time=01:00:00

# Mail type
# e.g., which events trigger email notifications
#SBATCH --mail-type=ALL

# Mail address
#SBATCH --mail-user=yournetid@msu.edu

# Standard output and error to file
# %x: job name, %j: job ID
#SBATCH --output=%x-%j.SLURMout

echo "This script is from ICER's example SLURM scripts"

module load Conda/3
conda activate myenv

# Run our job
cd /mnt/home/user123
srun python train_neural_network.py

# Print resource information
scontrol show job $SLURM_JOB_ID
js -j $SLURM_JOB_ID
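The contents of train_neural_network.py are not part of this page, but a minimal sketch of what such a script might contain is shown below. The model, data, and training loop are placeholders; the point is how the script picks up the GPU(s) SLURM allocated to the job. With --gpus=v100:4, torch.cuda.device_count() should report 4 inside the job.

# Hypothetical sketch of train_neural_network.py (illustrative only;
# assumes PyTorch is installed in the "myenv" conda environment)
import torch
import torch.nn as nn

def main():
    # Use a GPU made visible to the job by SLURM, if any
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Visible GPUs: {torch.cuda.device_count()}, using: {device}")

    # Placeholder model, optimizer, loss, and random data
    model = nn.Linear(10, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()
    x = torch.randn(256, 10, device=device)
    y = torch.randn(256, 1, device=device)

    # Minimal training loop
    for epoch in range(10):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        print(f"epoch {epoch}: loss {loss.item():.4f}")

if __name__ == "__main__":
    main()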
Single node, multiple cores
There are two ways a job can use multiple CPU cores (also known as processors) on a single node: each core can work independently on its own task (also called a process), or several cores can work collaboratively within a single task. The latter style is appropriate for jobs written with OpenMP. Examples of each are shown below.
Independent CPUs
The hypothetical Python script in this example would use a pool of processes independently completing tasks. The job script loads our conda environment following the recommended setup in the conda usage guide.
#!/bin/bash --login

# Job name:
#SBATCH --job-name=python-pool

# Number of nodes
#SBATCH --nodes=1

# Number of tasks to run on each node
#SBATCH --ntasks-per-node=16

# Memory per CPU
# Specify "M" or "G" for MB and GB respectively
#SBATCH --mem-per-cpu=20M

# Wall time
# Format: "minutes", "hours:minutes:seconds",
# "days-hours", or "days-hours:minutes"
#SBATCH --time=01:00:00

# Mail type
# e.g., which events trigger email notifications
#SBATCH --mail-type=ALL

# Mail address
#SBATCH --mail-user=yournetid@msu.edu

# Standard output and error to file
# %x: job name, %j: job ID
#SBATCH --output=%x-%j.SLURMout

echo "This script is from ICER's example SLURM scripts"

module load Conda/3
conda activate myenv

# Run our job
cd /mnt/home/user123
srun python my_processor_pool.py

# Print resource information
scontrol show job $SLURM_JOB_ID
js -j $SLURM_JOB_ID
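Likewise, my_processor_pool.py is not shown on this page. A minimal sketch of such a pool-of-workers script, using Python's multiprocessing module and sizing the pool from the standard SLURM_NTASKS environment variable, might look like the following; the work function is a placeholder.

# Hypothetical sketch of my_processor_pool.py (illustrative only)
import os
from multiprocessing import Pool

def work(item):
    # Placeholder for one independent unit of work
    return item * item

def main():
    # SLURM_NTASKS reflects the --ntasks-per-node=16 request above
    n_workers = int(os.environ.get("SLURM_NTASKS", "1"))
    with Pool(processes=n_workers) as pool:
        results = pool.map(work, range(100))
    print(f"Completed {len(results)} tasks with {n_workers} workers")

if __name__ == "__main__":
    main()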
Collaborative CPUs
This example is suited to software written with OpenMP, where multiple CPU cores are needed per task (also called a process).
Warning
For this kind of job, users must specify --cpus-per-task at the top of their job file and when calling srun. See the Lab Notebook on changes to srun for more.
Warning
When using srun -c around your job's executable as in this example, it is usually not necessary to specify OMP_NUM_THREADS; however, setting OMP_NUM_THREADS will override any options passed to srun.
#!/bin/bash --login

# Job name:
#SBATCH --job-name=openmp-threaded

# Number of nodes
#SBATCH --nodes=1

# Number of tasks to run on each node
#SBATCH --ntasks-per-node=1

# Number of CPUs per task
#SBATCH --cpus-per-task=32

# Memory per node
# Specify "M" or "G" for MB and GB respectively
#SBATCH --mem=2G

# Wall time
# Format: "minutes", "hours:minutes:seconds",
# "days-hours", or "days-hours:minutes"
#SBATCH --time=01:00:00

# Mail type
# e.g., which events trigger email notifications
#SBATCH --mail-type=ALL

# Mail address
#SBATCH --mail-user=yournetid@msu.edu

# Standard output and error to file
# %x: job name, %j: job ID
#SBATCH --output=%x-%j.SLURMout

echo "This script is from ICER's example SLURM scripts"

# Purge current modules and load those we require
module purge
module load GCC/10.3.0

# Run our job
cd $SCRATCH

# You MUST specify the number of CPUs per task again.
# Alternatively, you can set OMP_NUM_THREADS
srun -c $SLURM_CPUS_PER_TASK my_openmp

# Print resource information
scontrol show job $SLURM_JOB_ID
js -j $SLURM_JOB_ID
Multiple nodes
When your required resources exceed those available on a single node,
you can request multiple nodes while specifying per-node resources.
Note
In most cases, the srun command takes the place of mpirun.
The srun command does not require an argument specifying the number
of processes to be used for the job.
#!/bin/bash --login

# Job name:
#SBATCH --job-name=mpi-parallel

# Number of nodes
#SBATCH --nodes=4

# Number of tasks to run on each node
#SBATCH --ntasks-per-node=32

# Memory per node
# Specify "M" or "G" for MB and GB respectively
#SBATCH --mem=2G

# Wall time
# Format: "minutes", "hours:minutes:seconds",
# "days-hours", or "days-hours:minutes"
#SBATCH --time=01:00:00

# Mail type
# e.g., which events trigger email notifications
#SBATCH --mail-type=ALL

# Mail address
#SBATCH --mail-user=yournetid@msu.edu

# Standard output and error to file
# %x: job name, %j: job ID
#SBATCH --output=%x-%j.SLURMout

echo "This script is from ICER's example SLURM scripts"

# Purge current modules and load those we require
module purge
module load GCC/10.3.0 OpenMPI/4.1.1

# Run our job
cd $SCRATCH
srun my_mpi_job

# Print resource information
scontrol show job $SLURM_JOB_ID
js -j $SLURM_JOB_ID
Multiple nodes with multiple cores per task (hybrid jobs)
Some programs can use both MPI and OpenMP, for example, to implement what is called "hybrid" parallelism, where each node runs one or more multithreaded processes.
Warning
For this kind of job, users must specify --cpus-per-task at the top of their job file and when calling srun. See the Lab Notebook on changes to srun for more.
Warning
When using srun -c around your job's executable as in this example, it is usually not necessary to specify OMP_NUM_THREADS; however, setting OMP_NUM_THREADS will override any options passed to srun.
#!/bin/bash --login

# Job name:
#SBATCH --job-name=mpi-hybrid

# Number of nodes
#SBATCH --nodes=8

# Number of tasks to run on each node
#SBATCH --ntasks-per-node=6

# Number of CPUs per task
#SBATCH --cpus-per-task=4

# Memory per node
# Specify "M" or "G" for MB and GB respectively
#SBATCH --mem=2G

# Wall time
# Format: "minutes", "hours:minutes:seconds",
# "days-hours", or "days-hours:minutes"
#SBATCH --time=01:00:00

# Mail type
# e.g., which events trigger email notifications
#SBATCH --mail-type=ALL

# Mail address
#SBATCH --mail-user=yournetid@msu.edu

# Standard output and error to file
# %x: job name, %j: job ID
#SBATCH --output=%x-%j.SLURMout

echo "This script is from ICER's example SLURM scripts"

# Purge current modules and load those we require
module purge
module load GCC/10.3.0 OpenMPI/4.1.1

# Run our job
cd $SCRATCH

# You MUST specify the number of CPUs per task again.
# Alternatively, you can set OMP_NUM_THREADS
srun -c $SLURM_CPUS_PER_TASK my_hybrid_program

# Print resource information
scontrol show job $SLURM_JOB_ID
js -j $SLURM_JOB_ID