
How to check which tasks are still running in a SLURM batch job?

Tags:

slurm

When scheduling a batch job in SLURM, e.g.

sbatch -N 10 batch-script.sh

where batch-script.sh contains:

#!/bin/bash
#SBATCH --job-name=jobname

srun --label /usr/bin/hostname

it is possible to check which step is currently running with sacct:

       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
...
421.1        hostname                  test         10    RUNNING      0:0

But how can one check which tasks/nodes are still running in the current step and which have finished? (In this case there's only 1 task per node.) The only option I found in the docs is to set a --task-epilog command and log something when each task is done.

It would be great to see, for example, that 8 out of 10 nodes have finished their task, and node03 and node08 are still running theirs.
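The --task-epilog idea mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: the script name task-epilog.sh, the log path in TASK_DONE_LOG, and the helper names log_task_done and pending_nodes are all hypothetical, and the log file is assumed to live on a filesystem shared by all nodes. SLURM_JOB_ID, SLURM_STEP_ID, SLURM_PROCID and SLURMD_NODENAME are variables Slurm normally sets in a task's environment.

```shell
#!/bin/bash
# task-epilog.sh: hypothetical helper passed to srun via --task-epilog.

# Log file on a shared filesystem; the name and location are assumptions.
TASK_DONE_LOG="${TASK_DONE_LOG:-$HOME/task-done.log}"

# Append one line per finished task: job ID, step ID, task rank, node name.
log_task_done() {
    printf '%s.%s task=%s node=%s\n' \
        "${SLURM_JOB_ID:-?}" "${SLURM_STEP_ID:-?}" \
        "${SLURM_PROCID:-?}" "${SLURMD_NODENAME:-$(hostname)}" \
        >> "$TASK_DONE_LOG"
}

# Print every node given as an argument that has no "done" line in the log yet.
pending_nodes() {
    local node
    for node in "$@"; do
        grep -q "node=${node}\$" "$TASK_DONE_LOG" 2>/dev/null || echo "$node"
    done
}

# Only write a log line when running inside a Slurm task.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    log_task_done
fi
```

In the batch script this would be wired in as srun --label --task-epilog=/path/to/task-epilog.sh /usr/bin/hostname, with the script marked executable. From a login shell, sourcing the file and running pending_nodes $(scontrol show hostnames "$(squeue -h -j 421 -o %N)") should then list the nodes that have not logged a finished task yet, assuming one task per node as in the question.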

Disenchanted asked Nov 02 '25 14:11


1 Answer

You can see which nodes are active with the squeue command. To show only your own jobs, use squeue -u [yourname]. To refresh the output every second, use watch -n 1 "squeue -u [yourname]".
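As a rough sketch of the squeue approach, the snippet below prints the job ID and node list of each running job for the current user. The helper name running_nodes is hypothetical; the format codes %i (job ID), %T (state) and %N (node list) and the -h (no header) flag are standard squeue options. Keep in mind that squeue reports the nodes allocated to a job, so this may not distinguish tasks that have already finished within a step.

```shell
#!/bin/bash
# Print "<jobid> <nodelist>" for every RUNNING job read from stdin.
# Expects lines in the form produced by: squeue -h -o "%i %T %N"
running_nodes() {
    awk '$2 == "RUNNING" { print $1, $3 }'
}

# Only query Slurm when squeue is actually installed on this machine.
if command -v squeue >/dev/null 2>&1; then
    squeue -h -u "${USER:-$(id -un)}" -o "%i %T %N" | running_nodes
fi
```

Wrapping the squeue pipeline in watch -n 1, as described above, would keep the list refreshing.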

Maarten-vd-Sande answered Nov 04 '25 15:11


