SLURM: Changes to srun for jobs with multiple CPUs per task
Note
As of ICER's update of the HPCC's SLURM scheduling system to version 24.05, srun has reverted to its behavior prior to the 23.02 update.
ICER has recently updated the HPCC's SLURM scheduling system to version 23.02. With this update comes a change to the behavior of srun that affects jobs using multiple cores per task.
Take the following SLURM job script as an example. Here we request 5 tasks (or processes), each running on 2 CPUs, for a total of 10 CPUs:
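A minimal sketch of such a job script follows, written to a file with a here-document so that it can be submitted with sbatch. The task count, the CPUs per task, and the executable name my_job.exe come from the text; the file name example_job.sb and the 10-minute walltime are assumptions chosen for illustration.

```shell
# Write the example batch script to a file (the file name is an assumption).
cat > example_job.sb <<'EOF'
#!/bin/bash
# Walltime: assumed value for illustration.
#SBATCH --time=00:10:00
# 5 tasks (processes), each with 2 CPUs, for 10 CPUs in total.
#SBATCH --ntasks=5
#SBATCH --cpus-per-task=2

srun my_job.exe
EOF

# On the cluster, submit it with: sbatch example_job.sb
```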
How srun used to work
Historically, srun would run my_job.exe with 5 tasks and 2 CPUs per task by default. No additional command line arguments needed to be specified for srun to execute in this way.
This is because SLURM sets a number of environment variables for a job which describe the resources requested. Specifically,
- The --ntasks (or -n) request is saved to the SLURM_NTASKS variable
- The --cpus-per-task (or -c) request is saved to the SLURM_CPUS_PER_TASK variable
Then, srun inherits values from these environment variables.
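To see what a job has recorded, these variables can be printed from inside the job. The sketch below uses a hypothetical helper, show_request, and prints "unset" for any variable that SLURM did not define (for example, when it is run outside a job).

```shell
# show_request is a hypothetical helper name, not part of SLURM.
show_request() {
    echo "ntasks=${SLURM_NTASKS:-unset} cpus_per_task=${SLURM_CPUS_PER_TASK:-unset}"
}

show_request
```

In a job submitted with --ntasks=5 and --cpus-per-task=2, both variables are set, so the helper reports 5 and 2.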
How srun works now
As of SLURM version 22.05, srun no longer reads the variable SLURM_CPUS_PER_TASK.
Instead, you must now re-specify the CPUs per task when calling srun. The final line of our example batch script should be replaced with:
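With the 2 CPUs per task from the example, the replacement line can be sketched as below. It is wrapped in a function (run_my_job, a hypothetical name) and only invoked when SLURM_JOB_ID is set, so the snippet is harmless outside a job; in a batch script the srun line alone is enough.

```shell
# run_my_job is a hypothetical wrapper; the srun line is the actual change.
run_my_job() {
    srun -c 2 my_job.exe
    # Equivalent long form: srun --cpus-per-task=2 my_job.exe
}

# SLURM sets SLURM_JOB_ID inside a job, so this only launches there.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    run_my_job
fi
```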
The environment variable SLURM_CPUS_PER_TASK is still available to the user, so it is also possible to execute srun by passing the value of this variable to the --cpus-per-task/-c option:
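A minimal sketch of this form is below. The fallback to 1 is an assumption added for safety: SLURM only sets SLURM_CPUS_PER_TASK when --cpus-per-task was requested, and an empty value would leave -c without an argument. The function name run_with_job_cpus is hypothetical.

```shell
run_with_job_cpus() {
    # Use the job's own request; default to 1 if the variable is unset or empty.
    srun -c "${SLURM_CPUS_PER_TASK:-1}" my_job.exe
}

# Only launch inside a SLURM job, where SLURM_JOB_ID is defined.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    run_with_job_cpus
fi
```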
Alternatively, you can set the new SRUN_CPUS_PER_TASK environment variable:
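Either a literal value or the job's own request can be exported before calling srun. The sketch below takes the second route and falls back to 2 (the example's value) when SLURM_CPUS_PER_TASK is unset; the fallback is an assumption for illustration.

```shell
# srun reads SRUN_CPUS_PER_TASK as if -c had been given on the command line.
export SRUN_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK:-2}"

# Inside a job, srun can now be called without -c.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun my_job.exe
fi
```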
Warning
If you are using OpenMP, setting OMP_NUM_THREADS will override both the -c option and SRUN_CPUS_PER_TASK.
Try it for yourself
You can test the behavior of srun with a hybrid MPI/OpenMP example available through getexample.
Then, request a 30-minute interactive job with 5 tasks and 2 CPUs per task:
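One way to make this request is with SLURM's salloc command, sketched below. Whether ICER recommends salloc or a site-specific wrapper for interactive jobs is not stated here, so treat the choice of command as an assumption; the resource values come from the text. The function name request_interactive is hypothetical.

```shell
request_interactive() {
    # 5 tasks, 2 CPUs per task, 30 minutes of walltime.
    salloc --ntasks=5 --cpus-per-task=2 --time=00:30:00
}

# Only submit where salloc is installed, i.e. on the cluster.
if command -v salloc >/dev/null 2>&1; then
    request_interactive
fi
```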
Once your interactive job starts, try running:
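The two runs can be sketched as follows, assuming the example provides an executable named hybrid that srun can find, as the text implies. The compare_runs wrapper is hypothetical and only labels the output of each run.

```shell
compare_runs() {
    echo "== srun hybrid =="
    srun hybrid
    echo "== srun -c 2 hybrid =="
    srun -c 2 hybrid
}

# Only launch inside the interactive job, where SLURM_JOB_ID is defined.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    compare_runs
fi
```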
You should see that srun hybrid, without any additional arguments, allocated 10 CPUs to each of the 5 processes. On the other hand, srun -c 2 hybrid has the expected behavior of 2 CPUs for each of the 5 processes.
Now, set the SRUN_CPUS_PER_TASK variable:
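A minimal sketch of this step, assuming the executable is named hybrid as in the text:

```shell
# Equivalent to passing -c 2 on every srun call made from this shell.
export SRUN_CPUS_PER_TASK=2

# Inside the interactive job, rerun the example without -c.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun hybrid
fi
```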
You'll see the same behavior as srun -c 2 hybrid.