SLURM --- Changes to srun for jobs with multiple CPUs per task
ICER has recently updated the HPCC's SLURM scheduling system to version 23.02. With this update comes a change to the behavior of srun that affects jobs using multiple cores per task.
Take the following SLURM job script as an example. Here we request 5 tasks (or processes), each running on 2 CPUs, for a total of 10 CPUs:
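A minimal sketch of such a script follows. The file name example_job.sb and the 10 minute time limit are assumptions made for illustration; the resource requests and the final srun line follow the description above. The snippet writes the script to a file so that it can be submitted afterwards:

```shell
# Write the example job script to a file (example_job.sb is a hypothetical name).
cat > example_job.sb <<'EOF'
#!/bin/bash
# 5 tasks (processes)
#SBATCH --ntasks=5
# 2 CPUs for each task, 10 CPUs in total
#SBATCH --cpus-per-task=2
# Assumed time limit, not taken from the text
#SBATCH --time=00:10:00

srun my_job.exe
EOF
```

The file could then be submitted with sbatch example_job.sb.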
How srun used to work
Historically, srun would run my_job.exe with 5 tasks and 2 CPUs per task by default. No additional command line arguments needed to be specified for srun to execute in this way.
This is because SLURM sets a number of environment variables for a job which describe the resources requested. Specifically,
- The --ntasks (or -n) request is saved to the SLURM_NTASKS variable
- The --cpus-per-task (or -c) request is saved to the SLURM_CPUS_PER_TASK variable
Then, srun inherits values from these environment variables.
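To see these variables for yourself, something like the following sketch can be run inside a job. The helper name show_slurm_request and the placeholder text are assumptions of this example; SLURM only sets each variable when the matching option was requested, so outside a job the placeholder is printed:

```shell
# Print the resource requests that SLURM exports into a job's environment.
# When a variable is unset (outside a job, or the option was not requested),
# a placeholder is printed instead.
show_slurm_request() {
    echo "SLURM_NTASKS=${SLURM_NTASKS:-<unset>}"
    echo "SLURM_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK:-<unset>}"
}

show_slurm_request
```

Inside the example job above, the two lines should report 5 and 2 respectively.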
How srun works now
As of SLURM version 22.05 (and therefore in the 23.02 release now installed), srun no longer reads the variable SLURM_CPUS_PER_TASK.
Instead, you must now re-specify the CPUs per task when calling srun. The final line of our example batch script should be replaced with:
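A sketch of the replacement line, using the 2 CPUs per task from the example request. The first line is not part of the job script; it is a stub added here (an assumption of this sketch) so the snippet can be tried on a machine without SLURM, where it only prints the command:

```shell
# Outside a SLURM installation, fall back to a stub that only prints the command.
command -v srun >/dev/null 2>&1 || srun() { echo "[dry run] srun $*"; }

# Re-specify the CPUs per task explicitly, matching --cpus-per-task=2 in the job request.
srun --cpus-per-task=2 my_job.exe

# Equivalent short form; use one or the other in a real script.
# srun -c 2 my_job.exe
```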
The environment variable SLURM_CPUS_PER_TASK is still available to the user, so it is also possible to execute srun by passing the value of this variable to the --cpus-per-task/-c option:
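A sketch of this variant, which avoids repeating the number by hand. The fallback of 1 when SLURM_CPUS_PER_TASK is unset and the dry-run stub on the first line are assumptions of this example, not part of the job script:

```shell
# Outside a SLURM installation, fall back to a stub that only prints the command.
command -v srun >/dev/null 2>&1 || srun() { echo "[dry run] srun $*"; }

# Forward the value requested with --cpus-per-task to srun.
# The :-1 default only applies when SLURM did not set the variable.
srun --cpus-per-task="${SLURM_CPUS_PER_TASK:-1}" my_job.exe
```

With this form, changing the #SBATCH --cpus-per-task request is enough; the srun line does not need to be edited.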
Alternatively, you can set the new SRUN_CPUS_PER_TASK environment variable:
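A sketch of this approach. It copies the value from SLURM_CPUS_PER_TASK when that variable is set and otherwise falls back to 2, the value from the example request; the fallback and the dry-run stub are assumptions of this example:

```shell
# Outside a SLURM installation, fall back to a stub that only prints the command.
command -v srun >/dev/null 2>&1 || srun() { echo "[dry run] srun $*"; }

# srun uses SRUN_CPUS_PER_TASK as its default for --cpus-per-task,
# so the srun line itself can stay unchanged.
export SRUN_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK:-2}"
srun my_job.exe
```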
Warning
If you are using OpenMP, setting OMP_NUM_THREADS will override both the -c option and SRUN_CPUS_PER_TASK.
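Because OMP_NUM_THREADS decides how many threads the OpenMP runtime starts, one common way to avoid a mismatch is to derive it from the same value that is handed to srun. The following is only a sketch of that idea, not a setting prescribed above; the variable name cpus and the fallback of 1 when SLURM_CPUS_PER_TASK is unset are assumptions:

```shell
# Take the per-task CPU count from the job request; default to 1 outside a job.
cpus="${SLURM_CPUS_PER_TASK:-1}"

# Keep the OpenMP thread count and srun's CPUs per task in agreement.
export OMP_NUM_THREADS="$cpus"
export SRUN_CPUS_PER_TASK="$cpus"

echo "OMP_NUM_THREADS=$OMP_NUM_THREADS SRUN_CPUS_PER_TASK=$SRUN_CPUS_PER_TASK"
```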
Try it for yourself
You can test the behavior of srun with a hybrid MPI/OpenMP example available through getexample.
Then, request a 30 minute interactive job with 5 tasks and 2 CPUs per task:
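One possible way to make this request, assuming SLURM's standard salloc command is used (a site may also provide its own wrapper for interactive jobs). The first line is only a stub, added so the snippet can be tried on a machine without SLURM:

```shell
# Outside a SLURM installation, fall back to a stub that only prints the command.
command -v salloc >/dev/null 2>&1 || salloc() { echo "[dry run] salloc $*"; }

# 5 tasks, 2 CPUs per task, 30 minutes of wall time.
salloc --ntasks=5 --cpus-per-task=2 --time=00:30:00
```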
Once your interactive job starts, try running srun hybrid and then srun -c 2 hybrid.
You should see that srun hybrid, without any additional arguments, allocates 10 CPUs to each of the 5 processes. On the other hand, srun -c 2 hybrid has the expected behavior of 2 CPUs for each of the 5 processes.
Now, set the SRUN_CPUS_PER_TASK variable to 2 (export SRUN_CPUS_PER_TASK=2) and run srun hybrid again.
You'll see the same behavior as srun -c 2 hybrid.