SLURM: Changes to srun for jobs with multiple CPUs per task
ICER has recently updated the HPCC's SLURM scheduling system to version 23.02. With this update comes a change to the behavior of srun that affects jobs using multiple cores per task.
Take the following SLURM job script as an example. Here we request 5 tasks (or processes), each executed by 2 CPUs, for a total of 10 CPUs:
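A minimal sketch of such a script, consistent with the request described above, might look like this. The --time value is an assumption for illustration; only the task and CPU counts and the final srun line come from the text.

```bash
#!/bin/bash
#SBATCH --ntasks=5          # 5 tasks (processes)
#SBATCH --cpus-per-task=2   # 2 CPUs per task, 10 CPUs in total
#SBATCH --time=00:10:00     # assumed wall time, not from the original

# Launch my_job.exe once per task
srun my_job.exe
```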
How srun used to work
Previously, srun would run my_job.exe with 5 tasks and 2 CPUs per task by default. No additional command-line arguments needed to be specified for srun to execute this way.
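This is because SLURM sets a number of environment variables for a job which describe the resources requested. Specifically, the --ntasks (or -n) request is saved to the SLURM_NTASKS variable, and the --cpus-per-task (or -c) request is saved to the SLURM_CPUS_PER_TASK variable. When called within a job, srun inherited its values from these environment variables.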
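To see what a job has recorded, the variables can simply be printed. The sketch below is illustrative: show_request is a hypothetical helper name, and the "unset" fallback only matters when the snippet runs outside a job or when the corresponding option was not requested.

```bash
#!/bin/bash
# Print the task and CPU request that SLURM stored in the job environment.
# SLURM_CPUS_PER_TASK is only set when -c/--cpus-per-task was requested.
show_request() {
    echo "tasks=${SLURM_NTASKS:-unset} cpus_per_task=${SLURM_CPUS_PER_TASK:-unset}"
}

show_request
```

Inside a job that requested 5 tasks and 2 CPUs per task, this would be expected to print tasks=5 cpus_per_task=2.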
How srun works now
As of version 22.05, srun no longer reads the variable SLURM_CPUS_PER_TASK. Instead, you must now re-specify the CPUs per task when calling srun. The final line of our example batch script should be replaced with srun -c 2 my_job.exe.
The environment variable SLURM_CPUS_PER_TASK is still available to the user, so it is also possible to execute srun by passing the value of this variable to the -c option, for example srun -c $SLURM_CPUS_PER_TASK my_job.exe.
Alternatively, you can set the new SRUN_CPUS_PER_TASK environment variable before calling srun, for example with export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK.
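A minimal sketch of the environment variable approach, assuming it is placed in the batch script just before the srun line. The fallback to 1 is an assumption that covers jobs submitted without -c, where SLURM_CPUS_PER_TASK is not set.

```bash
#!/bin/bash
# Mirror the batch request into the variable that srun reads.
# Fall back to 1 CPU per task when the job was submitted without -c.
export SRUN_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK:-1}"
echo "SRUN_CPUS_PER_TASK is set to ${SRUN_CPUS_PER_TASK}"

# The launch line then needs no -c option:
#   srun my_job.exe
```

Setting the variable once near the top of the script keeps every later srun call consistent with the original request.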
If you are using OpenMP, setting OMP_NUM_THREADS will override both the -c option and SRUN_CPUS_PER_TASK.
Try it for yourself
You can test the behavior of
srun with a hybrid MPI/OpenMP example available through
Then, request a 30-minute interactive job with 5 tasks and 2 CPUs per task, for example with salloc -n 5 -c 2 --time=00:30:00.
Once your interactive job starts, try running srun hybrid, followed by srun -c 2 hybrid.
You should see that srun hybrid, without any additional arguments, allocates 10 CPUs to each of the 5 processes. On the other hand, srun -c 2 hybrid has the expected behavior of 2 CPUs for each of the 5 processes.
Now, set the SRUN_CPUS_PER_TASK environment variable to 2 and run srun hybrid again. You'll see the same behavior as srun -c 2 hybrid.
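If the hybrid example is not at hand, a rough stand-in is a small script that reports how many CPUs each task may use. This is only a sketch: report_cpus is a hypothetical helper, it relies on nproc from GNU coreutils, and it assumes the cluster binds each task to its allocated CPUs. Because nproc also honours OMP_NUM_THREADS, leave that variable unset when comparing.

```bash
#!/bin/bash
# Report how many CPUs the current task may run on.
# SLURM_PROCID is the task rank set by srun; default to 0 outside a job step.
report_cpus() {
    echo "task ${SLURM_PROCID:-0}: $(nproc) CPU(s) available"
}

report_cpus
```

Saved as an executable script (say show_cpus.sh, a name chosen here for illustration), it can be launched with srun ./show_cpus.sh and then srun -c 2 ./show_cpus.sh to compare the two cases.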