You need to run, say, 30 srun jobs, but ensure that each job runs on a node from a particular list of nodes (nodes with identical performance, so that timings can be compared fairly). How would you do it?
What I tried:
srun --nodelist=machineN[0-3] <some_cmd>
runs <some_cmd> on all the nodes simultaneously (what I need: to run <some_cmd> on one of the available nodes from the list).
srun -p partition
seems to work, but needs a partition that contains exactly machineN[0-3], which is not always the case.
Ideas?
Partitions in Slurm can be thought of as a resource abstraction: a partition definition groups a set of nodes and attaches job limits and access controls to that group.
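For illustration, if you (or the cluster admin) were able to group the four benchmark nodes into their own partition, the relevant slurm.conf entry might look something like this (the partition name 'bench' and the limits are made up for the example):
PartitionName=bench Nodes=machineN[0-3] Default=NO MaxTime=24:00:00 State=UP
Each of the 30 jobs could then simply be submitted with srun -p bench -N1 -n1 <some_cmd>, but as noted in the question, creating such a partition is not always possible.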
Also keep in mind that a single task cannot be split across multiple nodes: requesting resources with --cpus-per-task keeps them on one node, whereas requesting several tasks with --ntasks may spread the job across multiple nodes.
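For example (the counts are arbitrary), the first form below is guaranteed to land on a single node, while the second may be spread over several:
srun --ntasks=1 --cpus-per-task=8 <some_cmd>
srun --ntasks=8 <some_cmd>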
Nodes provide resources such as processors, memory, swap and local disk, and jobs consume these resources. The default exclusive-use policy in Slurm can therefore result in inefficient utilization of the cluster and of its nodes' resources.
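If you want to check what resources a given node actually offers (and how much of them is currently allocated), scontrol can report it; machineN0 here is just one of the nodes from the question:
scontrol show node machineN0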
You can go the opposite direction and use the --exclude option of srun (sbatch accepts it as well):
srun --exclude=machineN[4-XX] <some_cmd>
Slurm will then only consider nodes that are not in the excluded list. If the list is long and complicated, it can be saved in a file.
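Tying this back to the original question, a minimal sketch for the 30 jobs (assuming each one fits on a single node and the jobs should be launched concurrently) could be:
for i in $(seq 1 30); do
    srun -N1 -n1 --exclude=machineN[4-XX] <some_cmd> &
done
wait
Each srun call asks for exactly one node and one task, and the exclusion list guarantees that the node comes from machineN[0-3]; jobs that cannot start immediately simply wait for a node from that set to free up.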
Another option is to check whether the Slurm configuration includes 'features' with
sinfo --format "%20N %20f"
If the 'features' column shows a comma-delimited list of features for each node (these might describe the CPU family, network connection type, etc.), you can select a subset of the nodes with a specific feature using
srun --constraint=<some_feature> <some_cmd>
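For instance, if the four benchmark nodes all carried a feature named, say, bench (a hypothetical name; use whatever sinfo actually reports), each of the 30 jobs could be restricted to that group with:
srun -N1 -n1 --constraint=bench <some_cmd>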