
Easy way to hold/release jobs by job array task id in slurm

Tags:

slurm

I have a bunch of job arrays that are running right now (SLURM).

For example, 2552376_1, 2552376_10, 2552376_20, 2552376_80, 2552377_1, 2552377_10, 2552377_20, 2552377_80 and so on.

Currently, I am interested only in the tasks whose ids end with _1.

Is there a way to hold all the others without listing the job ids individually (I have several hundred of them)? The following command works for holding all the jobs:

squeue -r -t PD -u $USER -o "scontrol hold %i" | tail -n +2 | sh

To release the ones with the needed task id, I use

squeue -r -u $USER -o "scontrol release %i" | tail -n +2 | grep "_1$" | sh

which picks the correct jobs.

lizaveta asked Jan 27 '26 00:01


1 Answer

Mass update of jobs can be done by abusing the output formatting of squeue:

Hold all your pending jobs:

squeue -h -r -t PD -u $USER -o "scontrol hold %i" | sh

Then release all your jobs ending in _1:

squeue -r -t PD -u $USER -o "scontrol release %i" | grep "_1$" | sh

First run the commands without the | sh part to make sure they do what you intend.

Note the -r option, which makes squeue display one job array element per line.
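
The two steps can probably be folded into one by filtering before generating the commands, so that the _1 tasks are never held in the first place. The following is only a sketch: gen_hold_except is a hypothetical helper name (not part of Slurm), and it assumes squeue is asked for bare ids with -o "%i" and no header (-h).

```shell
# Hypothetical helper (not a Slurm command): reads one array task id per
# line on stdin and prints a "scontrol hold" command for every id that
# does NOT end in _<suffix>.
gen_hold_except() {
    suffix="$1"
    # grep -v drops the ids we want to keep running; sed prefixes the rest.
    grep -v "_${suffix}\$" | sed 's/^/scontrol hold /'
}

# Intended usage (dry run first, then append "| sh" once the output looks right):
#   squeue -h -r -t PD -u "$USER" -o "%i" | gen_hold_except 1
```

As with the commands above, inspect the generated lines before piping them to sh. Anchoring the pattern with $ matters: without it, ids such as 2552376_10 would also be treated as ending in _1.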

damienfrancois answered Feb 01 '26 06:02