js
The SLURM command sacct shows the job steps of a job and the resources they used after the job has finished running:
$ sacct -j 40410
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
40410               job classres-+   classres         28  COMPLETED      0:0
40410.batch       batch              classres         28  COMPLETED      0:0
40410.extern     extern              classres         28  COMPLETED      0:0
40410.0            pw.x              classres         28  COMPLETED      0:0
40410.1            ph.x              classres         28  COMPLETED      0:0
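When sacct output has to be processed by a script, the pipe-delimited form produced by the -P (--parsable2) option is easier to handle than aligned columns. The following is a minimal sketch, not part of SLURM or powertools: the function name unfinished_steps is hypothetical, and it assumes the header row contains the JobID and State fields shown in the output above.

```shell
# Hypothetical helper: print the job steps whose State is not COMPLETED.
# Reads 'sacct -P' (pipe-delimited) output on standard input.
unfinished_steps() {
    awk -F'|' '
        NR == 1 {
            # Locate the JobID and State columns from the header row.
            for (i = 1; i <= NF; i++) {
                if ($i == "JobID") id = i
                if ($i == "State") st = i
            }
            next
        }
        # Only report rows when both columns were found in the header.
        id && st && $st != "COMPLETED" { print $id, $st }
    '
}

# Typical use on a cluster (requires SLURM):
#   sacct -j 40410 -P | unfinished_steps
```

For the job shown above, where every step is COMPLETED, the helper would print nothing.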
However, getting sacct to display exactly the results you want can take
some time spent reading the SLURM web site and learning how to use the
command. Here we introduce the powertools command "js", which displays
the resource usage of your jobs.
Display Usage Info of a Job
Simply run the powertools command "js" to get most of the useful
resource usage information. To see the resource usage of a job, use the
command "js -j <JobID>", e.g.,
$ js -j 45251 # powertools command
SLURM Job ID: 45251
WrkDir=/mnt/home/changc81/GetExample/GaAs
stdout=/mnt/home/changc81/GetExample/GaAs/slurm-45251.out
=========================================================================================================
JobID | 45251 | 45251.batch | 45251.extern | 45251.0 |
JobName | job | batch | extern | pw.x |
User | UserName | | | |
NodeList | lac-421 | lac-421 | lac-421 | lac-421 |
NNodes | 1 | 1 | 1 | 1 |
NTasks | | 1 | 1 | 28 |
NCPUS | 28 | 28 | 28 | 28 |
ReqMem | 112Gn | 112Gn | 112Gn | 112Gn |
Timelimit | 04:00:00 | | | |
Elapsed | 00:00:16 | 00:00:16 | 00:00:16 | 00:00:14 |
TotalCPU | 05:39.544 | 00:00.999 | 00:00.001 | 05:38.543 |
AveCPULoad | 21.2215 | 0.0624375 | 6.25e-05 | 24.1816 |
MaxRSS | | | 20K | 60296K |
MaxVMSize | | 189200K | 4184K | 605104K |
Start | 2018-08-28T20:27:40 | 2018-08-28T20:27:40 | 2018-08-28T20:27:40 | 2018-08-28T20:27:41 |
End | 2018-08-28T20:27:56 | 2018-08-28T20:27:56 | 2018-08-28T20:27:56 | 2018-08-28T20:27:55 |
ExitCode | 0:0 | 0:0 | 0:0 | 0:0 |
State | COMPLETED | COMPLETED | COMPLETED | COMPLETED |
=========================================================================================================
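In the output above, AveCPULoad equals TotalCPU divided by Elapsed: for the whole job, 339.544 s of CPU time over 16 s of wall time gives 21.2215. This relationship is inferred from the numbers shown, not taken from the js source. A small sketch that reproduces the calculation follows; the function names to_seconds and ave_cpu_load are hypothetical.

```shell
# Convert a SLURM time string ([D-]HH:MM:SS or MM:SS.fff) to seconds.
to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0; t = $0
        # Split off an optional leading day count ("D-").
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        # Fold the colon-separated parts into seconds, left to right.
        n = split(t, p, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        printf "%.3f\n", days * 86400 + secs
    }'
}

# Average CPU load = TotalCPU / Elapsed (assumption inferred from the table).
ave_cpu_load() {
    total=$(to_seconds "$1")
    elapsed=$(to_seconds "$2")
    awk -v t="$total" -v e="$elapsed" \
        'BEGIN { if (e > 0) printf "%.4f\n", t / e; else print "n/a" }'
}

ave_cpu_load 05:39.544 00:00:16   # whole job 45251, prints 21.2215
```

Dividing the result by NCPUS (28 here) gives roughly 0.76, i.e. on average about three quarters of the allocated cores were busy.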
To show more data about a job, add the -F option:
$ js -j <JobID> -F # powertools command
to list all stored data of the job steps.
Display a List of Jobs
To list jobs that you submitted earlier, use the "js -z" command. Specify
the period during which the jobs were running with the -S (start time of
the period) and -E (end time of the period) options:
$ js -z -S <Start Time> -E <End Time>
and a list of the jobs with their properties and resource usage is displayed. For example, run the command:
$ js -z -S 2021-04-12 -E 2021-04-19
JobID JobName NNo NTas NCPU Timelimit Elapsed AveCPU MaxRSS Stat Exit Start NodeList
------------ ---------- --- ---- ---- ----------- ----------- ------- ---------- ---- ---- ------------------- -----------------
21043834 ondemand/+ 1 1 1 01:00:00 01:00:16 0.05487 467.41M TIM+ 0:0 2021-04-12T08:59:10 css-033
21127831 fi_info 1 1 2 00:05:00 00:03:19 1.18321 58191.45M COM+ 0:0 2021-04-16T10:14:33 skl-033
21158898 hello.exe 1 8 1 00:20:00 00:01:09 0.236 644.61M COM+ 0:0 2021-04-17T20:17:57 amr-133
21158916 interacti+ 1 1 1 03:00:00 02:00:04 0.77325 1244.61M COM+ 0:0 2021-04-18T20:18:29 css-033
21158973 SPAdes 1 4 8 09:30:00 09:00:04 7.254 13.20G FAI+ 1:0 2021-04-19T20:20:44 lac-421
to see a list of jobs that ran between April 12th, 2021 and April 19th,
2021. If either -S or -E is not specified, the current time when "js" is
executed is used in its place.
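A job list like the one above can be filtered with standard text tools, for example to find the jobs that did not complete. The following is a sketch only: the function name not_completed is hypothetical, and it assumes whitespace-separated columns with no empty cells and a header containing a Stat column, as in the listing above.

```shell
# Hypothetical helper: print JobID, JobName and Stat for every listed job
# whose state is not COM+ (COMPLETED, truncated to the column width).
# Reads the output of "js -z" on standard input.
not_completed() {
    awk '
        $1 == "JobID" {
            # Find the Stat column by name in the header row.
            for (i = 1; i <= NF; i++) if ($i == "Stat") st = i
            next
        }
        /^-+/ { next }    # skip the dashed separator line
        st && $st != "COM+" { print $1, $2, $st }
    '
}

# Typical use on a cluster (requires powertools):
#   js -z -S 2021-04-12 -E 2021-04-19 | not_completed
```

Applied to the listing above, this would report job 21043834 (TIM+) and job 21158973 (FAI+).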
More Options of the js Command
To see all available options of the command, use the -h option:
$ js -h
js [<OPTION>]
Valid <OPTION> values are:
-a, --allusers:
Display jobs for all users. By default, only the
current user's jobs are displayed. If ran by user root
this is the default.
-A, --accounts:
Use this comma separated list of accounts to select jobs
to display. By default, all accounts are selected.
-b, --brief:
Equivalent to '--format=jobstep,state,error'.
-c, --completion: Use job completion instead of accounting data.
--delimiter:
ASCII characters used to separate the fields when
specifying the -p or -P options. The default delimiter
is a '|'. This options is ignored if -p or -P options
are not specified.
-C:
Display results in columns rather than rows. Each
column shows all data of a job step. A number can
be specified after -C for how many columns in a row.
-D, --duplicates:
If Slurm job ids are reset, some job numbers may
appear more than once referring to different jobs.
Without this option only the most recent jobs will be
displayed.
-e, --helpformat:
Print a list of fields that can be specified with the
'--format' option
-E, --endtime:
Select jobs eligible before this time. If states are
given with the -s option return jobs in this state before
this period.
--federation: Report jobs from federation if a member of a one.
-f, --file=file:
Read data from the specified file, rather than Slurm's
current accounting log file. (Only appliciable when
running the filetxt plugin.)
-F:
Display data of all fields (--format=ALL) in columns.
By default, three columns are shown in a row. See -C
to change the default column number.
-g, --gid, --group:
Use this comma separated list of gids or group names
to select jobs to display. By default, all groups are
selected.
-h, --help: Print this description of use.
-i, --nnodes=N:
Return jobs which ran on this many nodes (N = min[-max])
-I, --ncpus=N:
Return jobs which ran on this many cpus (N = min[-max])
-j, --jobs:
Format is <job(.step)>. Display information about this
job or comma-separated list of jobs. The default is all
jobs. Adding .step will display the specific job step of
that job. (A step id of 'batch' will display the
information about the batch step.)
-k, --timelimit-min:
Only send data about jobs with this timelimit.
If used with timelimit_max this will be the minimum
timelimit of the range. Default is no restriction.
-K, --timelimit-max:
Ignored by itself, but if timelimit_min is set this will
be the maximum timelimit of the range. Default is no
restriction.
--local Report information only about jobs on the local cluster.
Overrides --federation.
-l, --long:
Equivalent to specifying
'--format=jobid,jobname,partition,maxvmsize,maxvmsizenode,
maxvmsizetask,avevmsize,maxrss,maxrssnode,
maxrsstask,averss,maxpages,maxpagesnode,
maxpagestask,avepages,mincpu,mincpunode,
mincputask,avecpu,ntasks,alloccpus,elapsed,
state,exitcode,avecpufreq,reqcpufreqmin,
reqcpufreqmax,reqcpufreqgov,consumedenergy,
maxdiskread,maxdiskreadnode,maxdiskreadtask,
avediskread,maxdiskwrite,maxdiskwritenode,
maxdiskwritetask,avediskread,allocgres,reqgres
-L, --allclusters:
Display jobs ran on all clusters. By default, only jobs
ran on the cluster from where sacct is called are
displayed.
-M, --clusters:
Only send data about these clusters. Use "all" for all
clusters.
-n, --noheader:
No header will be added to the beginning of output.
The default is to print a header.
--noconvert:
Don't convert units from their original type
(e.g. 2048M won't be converted to 2G).
-N, --nodelist:
Display jobs that ran on any of these nodes,
can be one or more using a ranged string.
--name:
Display jobs that have any of these name(s).
-o, --format:
Comma separated list of fields. (use "--helpformat"
for a list of available fields).
-p, --parsable: output will be '|' delimited with a '|' at the end
-P, --parsable2: output will be '|' delimited without a '|' at the end
-q, --qos:
Only send data about jobs using these qos. Default is all.
-r, --partition:
Comma separated list of partitions to select jobs and
job steps from. The default is all partitions.
-s, --state:
Select jobs based on their current state or the state
they were in during the time period given: running (r),
completed (cd), failed (f), timeout (to), resizing (rs),
deadline (dl) and node_fail (nf).
-S, --starttime:
Select jobs eligible after this time. Default is
00:00:00 of the current day, unless '-s' is set then
the default is 'now'.
-T, --truncate:
Truncate time. So if a job started before --starttime
the start time would be truncated to --starttime.
The same for end time and --endtime.
-u, --uid, --user:
Use this comma separated list of uids or user names
to select jobs to display. By default, the running
user's uid is used.
--units=[KMGTP]:
Display values in specified unit type. Takes precedence
over --noconvert option.
--usage: Display brief usage message.
-v, --verbose:
Primarily for debugging purposes, report the state of
various variables during processing.
-V, --version: Print version.
-W, --wckeys:
Only send data about these wckeys. Default is all.
--whole-hetjob=[yes|no]:
If set to 'yes' (or not set), then information about all
the heterogeneous components will be retrieved. If set
to 'no' only the specific filtered components will be
retrieved.
-x, --associations:
Only send data about these association id. Default is all.
-X, --allocations:
Only show statistics relevant to the job allocation
itself, not taking steps into consideration.
-z: Show simple summary data only.
Note, valid start/end time formats are...
HH:MM[:SS] [AM|PM]
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
MM/DD[/YY]-HH:MM[:SS]
YYYY-MM-DD[THH:MM[:SS]]
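A wrapper script that passes dates to -S and -E may want to check them first. The sketch below tests only the shape of the last format listed, YYYY-MM-DD[THH:MM[:SS]]; the function name is_iso_time is hypothetical, it does not verify that the date exists on the calendar, and it deliberately rejects the other formats even though the help text lists them as valid.

```shell
# Hypothetical check: succeed only for strings shaped like
# YYYY-MM-DD, YYYY-MM-DDTHH:MM or YYYY-MM-DDTHH:MM:SS.
is_iso_time() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}(T[0-9]{2}:[0-9]{2}(:[0-9]{2})?)?$'
}

# Example use before calling js (requires powertools):
#   is_iso_time "$start" && is_iso_time "$end" && js -z -S "$start" -E "$end"
```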