Scavenger Queue

The scavenger queue allows users to run preemptible jobs on idle cores. Jobs in the scavenger queue may be interrupted when their resources are needed for non-scavenger jobs.

With few exceptions, each researcher using the HPCC is limited in the number of jobs or cores they can run at one time. Non-buy-in users are also limited in the total number of CPU and GPU hours they can use each year. These limits do not apply to jobs submitted to the scavenger queue.

Jobs in the scavenger queue can start on resources that would otherwise be left idle, improving research throughput. Like jobs submitted to the general-long queue, these jobs can request up to a 7-day wall time. By default, interrupted jobs are requeued, but users can choose to have them canceled instead if that better suits their workflow.

Note

We recommend this queue only for users whose jobs can checkpoint and restart, or whose workflows can handle jobs being canceled or requeued.
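As a rough illustration of what "checkpoint and restart" can mean in a job script, the sketch below records progress after each unit of work and resumes from the last recorded step when run again. The checkpoint file name, its one-number format, and the `run_steps` helper are all assumptions made for this example; real applications usually provide their own checkpointing mechanism.

```shell
#!/bin/bash
# Sketch of a restartable loop. The checkpoint file name, its one-number
# format, and the run_steps helper are assumptions made for illustration.

run_steps() {
    local checkpoint="$1"   # file holding the number of completed steps
    local total="$2"        # total number of steps to run
    local start=0

    # Resume from the last recorded step if a checkpoint exists.
    if [ -f "$checkpoint" ]; then
        start=$(cat "$checkpoint")
    fi

    local step
    for (( step = start; step < total; step++ )); do
        echo "running step $step"   # placeholder for one unit of real work
        # Write to a temporary file and rename it, so an interruption
        # never leaves a half-written checkpoint behind.
        echo $(( step + 1 )) > "$checkpoint.tmp"
        mv "$checkpoint.tmp" "$checkpoint"
    done
}

run_steps "${CHECKPOINT:-checkpoint.txt}" "${TOTAL_STEPS:-5}"
```

If the job is interrupted and later requeued, the same script starts again from the top, finds the checkpoint file, and skips the steps that already completed.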

Usage

To use the scavenger queue, add the following line to your job script:

#SBATCH --qos=scavenger

To prevent your job from requeuing automatically if interrupted, add the following line to your job script:

#SBATCH --no-requeue
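For context, a complete job script using the scavenger queue might look like the sketch below. The wall time, task count, and the `describe_run` helper are illustrative assumptions, not site recommendations. The script reads `SLURM_RESTART_COUNT`, which Slurm documents as being set for jobs that have been requeued; whether a preemption on this cluster increments it is an assumption worth verifying. If you add `--no-requeue`, the job is never requeued, so the restart branch is not taken.

```shell
#!/bin/bash
# Run in the scavenger queue.
#SBATCH --qos=scavenger
# Illustrative wall time and task count; adjust for your own job.
#SBATCH --time=04:00:00
#SBATCH --ntasks=1

# Hypothetical helper: report whether this is a first run or a requeued run.
describe_run() {
    # Default to 0 when the variable is unset (first run, or outside Slurm).
    local restarts="${SLURM_RESTART_COUNT:-0}"
    if [ "$restarts" -gt 0 ]; then
        echo "requeued run (restart $restarts)"
    else
        echo "first run"
    fi
}

describe_run

# Placeholder: start or resume your actual workload here.
```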

Scavenger queue priority does not depend on the wall time requested in your job script. For example, a job requesting 24 hours of wall time has the same scavenger queue priority as one requesting 4 hours.

Scavenger queue jobs are automatically assigned to the scavenger account, regardless of the -A (--account) setting in your job script.

Scheduling

The scavenger queue runs using the backfill scheduler (see How Jobs are Scheduled). Depending on the current load on the HPCC, it may take several minutes for a job to be scheduled. Scavenger queue jobs run for at least 1 minute before they can be preempted, and typically run for about 1 hour before preemption.