
SLURM: How to run 30 jobs on particular nodes only?

Tags: slurm

You need to run, say, 30 srun jobs, but ensure that each job runs on a node from a particular list of nodes (nodes with the same performance, so that timings can be compared fairly). How would you do it?

What I tried:

  • srun --nodelist=machineN[0-3] <some_cmd> : runs <some_cmd> on all the listed nodes simultaneously (what I need is to run <some_cmd> on one of the available nodes from the list)

  • srun -p partition seems to work, but needs a partition that contains exactly machineN[0-3], which is not always the case.

Ideas?

Ayrat asked May 27 '16 10:05


1 Answer

You can go the opposite direction and use the --exclude option of srun (sbatch accepts it as well):

srun --exclude=machineN[4-XX] <some_cmd>

Slurm will then only consider nodes that are not in the excluded list. If the list is long and complicated, it can be saved in a file; srun treats the value given to --exclude as a file name if it contains a "/" character.
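If you know the nodes you want rather than the ones you don't, the exclude list can be derived instead of typed by hand. The following is a minimal sketch, not a definitive recipe: build_exclude_list and the file names all_nodes.txt and allowed.txt are hypothetical names chosen for illustration, and the Slurm commands appear only in comments so the function itself can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a comma-separated list of every node that is
# NOT in the allowed set, suitable as the value of --exclude.
#   $1 = file with all node names, one per line
#   $2 = file with allowed node names, one per line
build_exclude_list() {
    # -v: invert match, -x: whole-line match, -F: fixed strings,
    # -f: read the patterns (allowed nodes) from a file.
    # paste joins the remaining node names with commas.
    grep -vxF -f "$2" "$1" | paste -sd, -
}

# Possible use on a real cluster (assumed workflow, not executed here):
#   sinfo -h -N -o "%N" | sort -u > all_nodes.txt
#   scontrol show hostnames "machineN[0-3]" > allowed.txt
#   srun --exclude="$(build_exclude_list all_nodes.txt allowed.txt)" <some_cmd>
```

The sort -u is there because sinfo -N prints a node once per partition it belongs to, and scontrol show hostnames expands the bracket expression into one name per line.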

Another option is to check whether the Slurm configuration includes "features" with:

sinfo  --format "%20N %20f"

If the "features" column shows a comma-delimited list of the features each node has (these might be CPU family, network connection type, etc.), you can select the subset of nodes with a specific feature using

srun --constraint=<some_feature> <some_cmd>
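
To tie this back to the 30 runs in the question, the constrained call can be wrapped in a loop. This is only a sketch under assumptions: run_on_feature is a hypothetical helper, the feature name "fast" and the command ./bench in the usage line are placeholders, and the SRUN variable exists purely so the loop can be dry-run with echo on a machine without Slurm.

```shell
#!/usr/bin/env bash
# SRUN can be overridden (e.g. SRUN=echo) for a dry run without a cluster.
SRUN="${SRUN:-srun}"

# run_on_feature COUNT FEATURE CMD [ARGS...]
# Hypothetical helper: launch CMD COUNT times, one after another, each as a
# single task on a single node that advertises FEATURE.
run_on_feature() {
    local count=$1 feature=$2
    shift 2
    local i
    for i in $(seq 1 "$count"); do
        # --nodes=1 --ntasks=1 keeps each run on exactly one node;
        # --constraint restricts the choice to nodes with the feature.
        "$SRUN" --nodes=1 --ntasks=1 --constraint="$feature" "$@"
    done
}

# Possible use (placeholder feature and command):
#   run_on_feature 30 fast ./bench
```

Each srun call blocks until its command finishes, so the runs execute sequentially and do not compete with each other, which is usually what you want when comparing timings.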

damienfrancois answered Sep 20 '22 07:09