In Slurm, the command squeue -u <username> lists all the jobs that are pending or active for a given user. Is there a quick way to tally them, so that I know how many outstanding jobs there are, counting both pending and actively running jobs? Thanks!
You can see all jobs running under an account with squeue -A account_name, and then get more information on each job with scontrol show job <jobid>.
The slurmd daemons provide fault-tolerant hierarchical communications. The user commands include: sacct, sacctmgr, salloc, sattach, sbatch, sbcast, scancel, scontrol, scrontab, sdiag, sh5util, sinfo, sprio, squeue, sreport, srun, sshare, sstat, strigger and sview. All of the commands can run anywhere in the cluster.
Please note that the hard maximum number of jobs that the Slurm scheduler can handle is 10000. It is best to keep the number of jobs you have submitted at any given time below half this amount, in case another user also wants to submit a large number of jobs.
sbatch submits a batch script to Slurm. The batch script may be given to sbatch through a file name on the command line, or if no file name is specified, sbatch will read in a script from standard input. The batch script may contain options preceded with "#SBATCH" before any executable commands in the script.
I would interpret "quick command" differently. Additionally, I would add -r for cases where you are using job arrays, so that each array element is listed on its own line:
squeue -u <username> -h -t pending,running -r | wc -l
The -h option removes the header, -t pending,running restricts the listing to those two states, and wc -l (word count) counts the lines of the output. I usually run it under watch:
watch 'squeue -u <username> -h -t pending,running -r | wc -l'
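If you want the pending and running counts separately rather than a single total, here is a minimal sketch that builds on the one-liner above. The function name tally_states is my own (hypothetical), and it assumes the input has exactly one state name per line, which is what squeue prints with -h -o '%T'.

```shell
# Hypothetical helper: reads one job state per line on stdin and
# prints "<STATE> <count>" for each distinct state, sorted by state name.
tally_states() {
    sort | uniq -c | awk '{ printf "%s %d\n", $2, $1 }'
}

# Usage on a cluster (requires squeue, so it is left commented out here):
#   squeue -u "$USER" -h -r -t pending,running -o '%T' | tally_states
```

Keeping the counting in a function that reads stdin means it can be tried on sample text without access to a cluster.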
If you just want to summarize the output of squeue, how about:
squeue -u <username> | awk '
    BEGIN {
        abbrev["R"]="(Running)"
        abbrev["PD"]="(Pending)"
        abbrev["CG"]="(Completing)"
        abbrev["F"]="(Failed)"
    }
    NR>1 {a[$5]++}
    END {
        for (i in a) {
            printf "%-2s %-12s %d\n", i, abbrev[i], a[i]
        }
    }'
which yields something like:
R (Running) 1
PD (Pending) 4
Explanation: the job state is assumed to be in the 5th field, according to the default format of squeue. To make this handy, add the following lines to your .bash_aliases or .bashrc (the filename may depend on the system):
function summary() {
    squeue "$@" | awk '
        BEGIN {
            abbrev["R"]="(Running)"
            abbrev["PD"]="(Pending)"
            abbrev["CG"]="(Completing)"
            abbrev["F"]="(Failed)"
        }
        NR>1 {a[$5]++}
        END {
            for (i in a) {
                printf "%-2s %-12s %d\n", i, abbrev[i], a[i]
            }
        }'
}
Then you can invoke the command with just summary [option], where [option] accepts any options to squeue if needed (mostly unnecessary).
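One caveat with relying on the 5th field: a job name containing spaces would shift the columns. A possible variant, sketched below, asks squeue for only the compact state code with -h -o '%t' and counts the first field instead. The function name summarize_states is hypothetical, and the abbreviation table is the same illustrative subset as above.

```shell
# Hypothetical helper: reads one compact state code per line on stdin
# (as printed by squeue -h -o '%t') and prints a per-state summary.
summarize_states() {
    awk '
        BEGIN {
            abbrev["R"]="(Running)"
            abbrev["PD"]="(Pending)"
            abbrev["CG"]="(Completing)"
            abbrev["F"]="(Failed)"
        }
        NF { count[$1]++ }    # skip blank lines, tally the state code
        END {
            for (s in count) {
                printf "%-2s %-12s %d\n", s, abbrev[s], count[s]
            }
        }' | sort
}

# Usage on a cluster (requires squeue, so it is left commented out here):
#   squeue -u "$USER" -h -o '%t' | summarize_states
```

States that are not in the table are still counted; they just print with an empty description.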
Hope this helps.