
Slurm: Why use srun inside sbatch?

Tags: hpc, slurm

In an sbatch script, you can launch programs or scripts directly (for example an executable file myapp), but many tutorials use srun myapp instead.

Despite reading some documentation on the topic, I do not understand the difference and when to use each of those syntaxes.

I hope this question is precise enough (1st question on SO), thanks in advance for your answers.

RomualdM asked Dec 05 '18 16:12





1 Answer

The srun command is used to create job 'steps'.

First, it gives better reporting of resource usage: the sstat command provides real-time resource usage for processes that are started with srun, and each step (each call to srun) is reported individually in the accounting.
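As a minimal sketch of the reporting point, the snippet below writes a job script with two srun calls, so that each becomes a separate step. The file name steps_job.sh and the programs prepare_input and myapp are placeholders, and the sstat/sacct lines are only shown as comments because they need a real Slurm cluster and a real job ID.

```shell
# Write a hypothetical job script; prepare_input and myapp are placeholders.
cat > steps_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=steps-demo
#SBATCH --ntasks=1

# Each srun call below becomes its own job step (<jobid>.0, then <jobid>.1).
srun ./prepare_input
srun ./myapp
EOF

# On a cluster with Slurm, one would then submit and inspect the steps:
#   sbatch steps_job.sh
#   sstat -j <jobid>.0 --format=JobID,AveCPU,MaxRSS          # while step 0 runs
#   sacct -j <jobid> --format=JobID,JobName,Elapsed,MaxRSS   # per-step accounting
```

With this layout, the accounting output should list one line per step rather than a single line for the whole batch script.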

Second, it can be used to set up many instances of a serial program (a program that uses only one CPU) in a single job, and to micro-schedule those programs inside the job allocation.
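The micro-scheduling idea can be sketched as follows, assuming a hypothetical serial program serial_prog and input files input_1 to input_8. The job requests four CPUs and launches eight single-task steps in the background; steps that do not fit should wait until a CPU is free. The exact option for dedicating CPUs to a step depends on the Slurm version (--exclusive here; newer releases also provide --exact), so treat the flags as an assumption to check against your cluster's documentation.

```shell
# Write a hypothetical job script; serial_prog and input_N are placeholders.
cat > serial_job.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1

# Launch eight single-CPU steps in the background.
for i in $(seq 1 8); do
    # --exclusive asks Slurm to give each step dedicated CPUs, so a step
    # that does not fit is deferred until an earlier one finishes.
    srun --ntasks=1 --exclusive ./serial_prog "input_$i" &
done

# Keep the batch script alive until every background step has finished.
wait
EOF

# On a cluster with Slurm:
#   sbatch serial_job.sh
```

The trailing wait matters: without it the batch script would exit right after the loop and the job would end while steps are still running.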

Finally, for parallel jobs, srun also plays the important role of starting the parallel program and setting up the parallel environment. It starts as many instances of the program as were requested with the --ntasks option, on the CPUs that were allocated to the job. In the case of an MPI program, it also handles the communication between the MPI library and Slurm.
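To illustrate the difference for parallel jobs, here is a small sketch that contrasts a plain command with the same command under srun. The MPI program mpi_app is a placeholder, and whether srun can start it directly depends on how the MPI library and Slurm were built (srun --mpi=list shows the supported types).

```shell
# Write a hypothetical job script; mpi_app is a placeholder.
cat > parallel_job.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=8

# Without srun, the command runs once, in the batch shell on the first node.
hostname

# With srun, one instance is started per requested task (8 here).
srun hostname

# Hypothetical MPI program: srun starts the 8 ranks inside the allocation.
srun ./mpi_app
EOF

# On a cluster with Slurm:
#   sbatch parallel_job.sh
```

In the job's output file, the first hostname should print a single line and the srun hostname step eight lines, one per task.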

damienfrancois answered Oct 01 '22 12:10
