Display Compute Nodes and Job Partitions with the sinfo Command

Information about Compute Nodes

If you would like to run a job that needs a lot of resources, it is a good idea to check what is available first, such as which nodes are free and how many cores and how much memory they offer, so that your job does not wait in the queue for too long. Users can run the SLURM command sinfo to get a list of the nodes controlled by the job scheduler. For example, the command sinfo -N -r -l uses the option -N to show individual nodes, -r to show only nodes responsive to SLURM, and -l for a long description.
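Beyond -N -r -l, sinfo also accepts an -o format string (for example sinfo -N -r -h -o "%N %C", where %C prints CPU counts as Allocated/Idle/Other/Total). As a sketch of filtering that output for nodes with idle cores, the sample lines below stand in for live cluster output so the pipeline runs anywhere:

```shell
# Sample lines in the "%N %C" format (node name, then A/I/O/T CPU counts);
# on a real cluster, replace the echo with: sinfo -N -r -h -o "%N %C"
sample='acm-018 128/0/0/128
acm-019 124/4/0/128
agg-015 164/28/0/192'
# Split fields on spaces and slashes; field 3 is the idle-CPU count.
echo "$sample" | awk -F'[ /]' '$3 > 0 {print $1, $3, "idle CPUs"}'
```

This prints only the nodes that still have idle CPUs, together with how many.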

However, sinfo prints one line per node for each partition the node belongs to, which makes the output repetitive. The powertools command node_status displays the same information in a much more readable form:

input
node_status  # powertools command
output
Thu Aug  7 08:58:47 AM EDT 2025

NodeName       Account         State     CPU(Load:Aloc Idl:Tot)    Mem(Aval:Tot)Mb   GPU(I:T)   Reason
----------------------------------------------------------------------------------------------------------
acm-000     deyoungbuyin        IDLE         0.00:  0  128:128      375280: 505202      N/A     
acm-001     deyoungbuyin        IDLE         0.05:  0  128:128      387945: 505202      N/A     
.......
acm-018        general       ALLOCATED     401.98:128    0:128      278763: 505202      N/A     
acm-019        general         MIXED       430.29:124    4:128      303526: 505202      N/A     
agg-000      ais-markle        MIXED         1.08: 12  180:192      643273: 763203      N/A     
agg-001       rhee-lab         MIXED         0.16:  2  190:192      662472: 763203      N/A     
.......
agg-015        general         MIXED       125.67:164   28:192      473948: 763203      N/A     
agg-016        general         MIXED        94.12:123   69:192      590532: 763203      N/A     
.......
nal-000        general       ALLOCATED       4.15:128    0:128      387031: 505170   a100(3:4)  
nal-001        general       ALLOCATED     128.12:128    0:128      387678: 505170   a100(3:4)  
.......
nvl-007        general         MIXED         2.83:  5   35: 40      178735: 376162   v100(3:8)  

intel18  =>   50.8%(buyin)   69.5%( 187)    16.5%: 20.3%( 9224)      46.6%(35.9Tb)    54%(144)   Usage%(Total)
  amd20  =>   71.8%(buyin)   83.2%( 358)    33.1%: 41.4%(49456)      56.2%( 239Tb)    74%(124)   Usage%(Total)
  amd22  =>   48.6%(buyin)   77.8%(  72)   169.1%: 58.5%( 9728)      42.4%(54.4Tb)    N/A(  0)   Usage%(Total)

Summary  =>   67.7%(buyin)   49.2%( 983)    43.6%: 35.5%(78844)      43.8%( 397Tb)    30%(564)   Usage%(Total)

The output of node_status is a good reference for finding out how many nodes are available for your jobs, as it displays important information including node names, buy-in accounts, node states, CPU cores, memory, GPUs, and the reason a node is unavailable.
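As a sketch (assuming node_status keeps the column layout shown above: name, account, state, and so on), awk can pick out nodes in the general account that still have capacity; a few captured sample lines stand in for the live command:

```shell
# Sample node_status lines (name, account, state, ...); on the cluster,
# pipe the real command instead: node_status | awk ...
sample='acm-000     deyoungbuyin        IDLE         0.00:  0  128:128      375280: 505202      N/A
acm-018        general       ALLOCATED     401.98:128    0:128      278763: 505202      N/A
acm-019        general         MIXED       430.29:124    4:128      303526: 505202      N/A'
# Keep general-account nodes that are not fully allocated.
echo "$sample" | awk '$2 == "general" && $3 != "ALLOCATED" {print $1, $3}'
```

Here only acm-019 passes the filter, since acm-000 belongs to a buy-in account and acm-018 is fully allocated.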

If you need the complete details of a particular node, you can use the command scontrol show node -a <node_name>:

input
scontrol show node -a acm-019
output
NodeName=acm-019 Arch=x86_64 CoresPerSocket=16 
   CPUAlloc=124 CPUEfctv=128 CPUTot=128 CPULoad=430.29
   AvailableFeatures=acm,amd22
   ActiveFeatures=acm,amd22
   Gres=(null)
   NodeAddr=acm-019 NodeHostName=acm-019 Version=24.05.8
   OS=Linux 5.15.0-126-generic #136-Ubuntu SMP Wed Nov 6 10:38:22 UTC 2024 
   RealMemory=505202 AllocMem=457514 FreeMem=303526 Sockets=8 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=150000 Weight=501 Owner=N/A MCS_label=N/A
   Partitions=iceradmin,scavenger,general-short,general-long,...
   BootTime=2025-05-10T15:06:37 SlurmdStartTime=2025-08-05T10:48:04
   LastBusyTime=2025-06-14T05:03:08 ResumeAfterTime=None
   CfgTRES=cpu=128,mem=505202M,billing=76790
   AllocTRES=cpu=124,mem=457514M
   CurrentWatts=0 AveWatts=0
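The key=value output of scontrol is easy to post-process (scontrol also offers a --oneliner flag that puts all pairs of a node on one line). As a sketch, the snippet below pulls the FreeMem field out of a captured fragment; on the cluster you would substitute the real command:

```shell
# A captured fragment of `scontrol show node` output; on the cluster use:
#   scontrol show node acm-019 | tr ' ' '\n' | awk -F= '...'
scontrol_out='NodeName=acm-019 CPUAlloc=124 CPUTot=128 RealMemory=505202 FreeMem=303526'
# Put one key=value pair per line, then match on the key.
echo "$scontrol_out" | tr ' ' '\n' | awk -F= '$1 == "FreeMem" {print $2, "MB free"}'
```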

SLURM Partitions for Jobs

One important detail about a node is what kind of jobs can run on it. For example, if a node is a buy-in node, non-buyin users can only run jobs on it with a walltime of 4 hours or less. We can check a summary of all partitions using sinfo with the -s option:

input
sinfo -s
output
PARTITION           AVAIL  TIMELIMIT   NODES(A/I/O/T) NODELIST
scavenger              up 7-00:00:00   480/429/61/970 acm-[000-047,050-069],agg-[000-062,065-072],agx-[000-001],amr-[000-237,240-244,246-253],nal-[000-003,008-010],ncc-000,nch-[000-003],neh-[000-001],nel-[000-001],nfh-[000-004],nif-[001-005],nvf-[000-020],nvl-[000-007],skl-[000-023,025-105,107-115,120-144,148-167]
ondemand               up 7-00:00:00          4/0/0/4 amr-[184-187]
general-short          up    4:00:00   476/429/61/966 acm-[000-047,050-069],agg-[000-062,065-072],agx-[000-001],amr-[000-183,188-237,240-244,246-253],nal-[000-003,008-010],ncc-000,nch-[000-003],neh-[000-001],nel-[000-001],nfh-[000-004],nif-[001-005],nvf-[000-020],nvl-[000-007],skl-[000-023,025-105,107-115,120-144,148-167]
general-long           up 7-00:00:00    197/68/34/299 acm-[018-047,061-067],agg-[015-047],agx-[000-001],amr-[188-237,246-253],ncc-000,skl-[027-052,054-100,102-105,107-112,143-144,162-163]
general-long-bigmem    up 7-00:00:00        15/1/1/17 acm-[047,061-067],agg-049,amr-[103,246-251]
general-long-gpu       up 7-00:00:00        10/6/1/17 nal-[000-001,010],nel-001,nfh-003,nvf-[018-020],nvl-[005-007]
general-long-grace     up 7-00:00:00          0/0/1/1 ncc-000
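The NODES(A/I/O/T) column above packs four counts, allocated/idle/other/total, into a single field. As a small sketch, awk can split it into labeled numbers; the line below is copied from the output (nodelist trimmed) so the snippet runs without a cluster:

```shell
# One line of `sinfo -s` output; on the cluster use:
#   sinfo -s -h -p general-long | awk '{split($4, n, "/"); ...}'
line='general-long up 7-00:00:00 197/68/34/299'
# Field 4 is NODES(A/I/O/T); split it on "/" and label each count.
echo "$line" | awk '{split($4, n, "/")
  printf "allocated=%s idle=%s other=%s total=%s\n", n[1], n[2], n[3], n[4]}'
```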

This summary shows each job partition along with its walltime limit and node counts. More detailed information about a particular partition can be found with the -p option:

input
sinfo -p general-long -r -l
output
Thu Aug 07 09:06:08 2025
PARTITION    AVAIL  TIMELIMIT   JOB_SIZE ROOT OVERSUBS     GROUPS  NODES       STATE RESERVATION NODELIST
general-long    up 7-00:00:00 1-infinite   no       NO        all      2     drained  amd24_perf agg-[021,047]
general-long    up 7-00:00:00 1-infinite   no       NO        all      3    draining             acm-024,agg-022,amr-204
general-long    up 7-00:00:00 1-infinite   no       NO        all      4     drained             agg-036,amr-247,ncc-000,skl-102
general-long    up 7-00:00:00 1-infinite   no       NO        all      2        down  amd24_perf agx-[000-001]
general-long    up 7-00:00:00 1-infinite   no       NO        all    176       mixed             acm-[019-020,023,025-026,031-034,036-037,039,041-044,047,061-067],agg-[015-020,023-025,027-029,031-035,037-046],amr-[188-203,205-237,246,248-253],skl-[027,032,034-035,040-043,048-052,054-056,058-100,103-105,107-112,162]
general-long    up 7-00:00:00 1-infinite   no       NO        all     18   allocated             acm-[018,021-022,027-030,035,038,040,045-046],agg-[026,030],skl-[057,143-144,163]

Users can also list only the nodes that belong to specific job partitions by combining -N and -p:

input
sinfo -N -l -r -p general-short,general-long
output
Thu Aug 07 09:06:41 2025
NODELIST   NODES     PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              
acm-000        1 general-short        idle 128    8:16:1 505202   150000    501 acm,amd2 none                
acm-001        1 general-short        idle 128    8:16:1 505202   150000    501 acm,amd2 none                
acm-002        1 general-short        idle 128    8:16:1 505202   150000    501 acm,amd2 none                
acm-003        1 general-short        idle 128    8:16:1 505202   150000    501 acm,amd2 none                
acm-004        1 general-short       mixed 128    8:16:1 505202   150000    501 acm,amd2 none                
.......
skl-163        1 general-short   allocated 40     2:20:1 376162   150000    303 skl,inte none                
skl-164        1 general-short        idle 40     2:20:1 376162   150000    303 skl,inte none                
skl-165        1 general-short        idle 40     2:20:1 376162   150000    303 skl,inte none                
skl-166        1 general-short       mixed 40     2:20:1 376162   150000    303 skl,inte none                
skl-167        1 general-short        idle 40     2:20:1 376162   150000    303 skl,inte none                
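As a final sketch, the per-node listing above is easy to summarize, for example by counting how many of the listed nodes are idle; the sample lines below stand in for the real sinfo -N -l -r -p output:

```shell
# Sample lines in the per-node layout above (NODELIST NODES PARTITION STATE CPUS);
# on the cluster: sinfo -N -l -r -h -p general-short,general-long | awk ...
sample='acm-000 1 general-short idle 128
acm-004 1 general-short mixed 128
skl-163 1 general-short allocated 40
skl-164 1 general-short idle 40'
# Count the rows whose STATE column (field 4) is "idle".
echo "$sample" | awk '$4 == "idle" {count++} END {print count, "idle nodes"}'
```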

For complete documentation of sinfo, please refer to the SLURM web page.