I'm using Slurm. When I run
sinfo -Nel
it is common to see a server designated as idle, but sometimes there is also a little asterisk next to it (like this: idle*).
What does that mean? I couldn't find any information about it. (The server is up and running.)
If the node is marked as "idle" (it is not running a job and is ready to accept one) or "alloc" (it is running a job), Slurm considers the node healthy. If it is marked otherwise (e.g., down or idle*), something is wrong.
DESCRIPTION. sinfo is used to view partition and node information for a system running Slurm.
A node in the drain state will not have any further jobs scheduled on it, but the currently running jobs will keep running (by contrast with setting the node down, which kills all jobs running on the node). Nodes are often set to that state so that some maintenance operation can take place once all running jobs are finished.
When an * appears after the state of a node, it means that the node is unreachable. Quoting the sinfo manpage under the NODE STATE CODES section:
* The node is presently not responding and will not be allocated any new work. If the node remains non-responsive, it will be placed in the DOWN state (except in the case of COMPLETING, DRAINED, DRAINING, FAIL, FAILING nodes).
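To spot such nodes without reading the whole listing, the rule quoted above can be sketched in a few lines of Python. This is only an illustration, not an official Slurm tool: it assumes the text comes from something like sinfo -N -h -o "%N %T" (one "node state" pair per line), and the function name unresponsive_nodes and the sample node names are made up for the example.

```python
def unresponsive_nodes(sinfo_output: str) -> list[str]:
    """Return the names of nodes whose state ends with the '*' suffix.

    Assumes each line holds a node name followed by its state,
    separated by whitespace (e.g. "node02 idle*").
    """
    flagged = []
    for line in sinfo_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        node, state = fields[0], fields[1]
        # With -N a node is listed once per partition, so avoid duplicates.
        if state.endswith("*") and node not in flagged:
            flagged.append(node)
    return flagged


# Hypothetical sample output; real node names and states will differ.
sample = """\
node01 idle
node02 idle*
node03 allocated
node04 down*
"""
print(unresponsive_nodes(sample))  # ['node02', 'node04']
```

On a real cluster the input could be captured with subprocess.run and passed to the same function. Other state suffixes listed in the manpage are deliberately ignored here; only the asterisk is checked.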