
What does the status "CG" mean in SLURM?

Tags: jobs, slurm

On a SLURM cluster one can use squeue to get information about jobs on the system.

I know that "R" means running and "PD" means pending, but what is "CG"?

I understand it to be "canceling" or "failing" from experience, but does "CG" apply when a job successfully closes? What is the G?

khaverim asked Feb 03 '17 20:02






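2 Answers

"CG" stands for "completing". A job enters this state while it is finishing, whether it ended successfully or not, and some of its processes may still be active on some nodes. A job that stays in "CG" for a long time usually has processes that cannot be terminated, probably because of an I/O operation.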

More detailed info in the Slurm Troubleshooting Guide
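
To check which jobs are currently in this state, one option is to filter squeue output on the compact state code. The following is only a minimal sketch: it assumes squeue is invoked as `squeue -h -o "%i %t"` (no header, then job ID and compact state on each line), and the helper names `parse_states`, `completing_jobs` and `query_squeue`, as well as the sample job IDs, are made up for illustration.

```python
import subprocess


def parse_states(squeue_output):
    """Map job ID -> compact state from lines shaped like '1002 CG'."""
    states = {}
    for line in squeue_output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            job_id, state = parts
            states[job_id] = state
    return states


def completing_jobs(squeue_output):
    """Return the sorted job IDs whose compact state is CG."""
    states = parse_states(squeue_output)
    return sorted(job_id for job_id, state in states.items() if state == "CG")


def query_squeue():
    """Run squeue on a real cluster (needs Slurm installed; not called below)."""
    result = subprocess.run(
        ["squeue", "-h", "-o", "%i %t"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample output, used instead of a live cluster.
sample = "1001 R\n1002 CG\n1003 PD\n1004 CG\n"
print(completing_jobs(sample))
```

On a cluster, `completing_jobs(query_squeue())` would apply the same filter to live data; running `squeue -t CG` directly does the filtering on the Slurm side instead.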

Bub Espinja answered Oct 17 '22 12:10


I found this in the 'squeue' section of the Slurm Troubleshooting Guide:

state

Job state, extended form: PENDING, RUNNING, STOPPED, SUSPENDED, CANCELLED, COMPLETING, COMPLETED, CONFIGURING, FAILED, TIMEOUT, PREEMPTED, NODE_FAIL, REVOKED and SPECIAL_EXIT. See the JOB STATE CODES section below for more information. (Valid for jobs only)

statecompact

Job state, compact form: PD (pending), R (running), CA (cancelled), CF (configuring), CG (completing), CD (completed), F (failed), TO (timeout), NF (node failure), RV (revoked) and SE (special exit state). See the JOB STATE CODES section below for more information. (Valid for jobs only)
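
The correspondence between the compact and extended forms quoted above can be written down as a small lookup table. This is just a sketch: it covers only the codes listed in the excerpt (other Slurm versions may define more), and the names `STATE_NAMES` and `describe`, along with the "UNKNOWN" fallback, are choices made for this example rather than anything Slurm provides.

```python
# Compact code -> extended name, limited to the codes quoted above.
STATE_NAMES = {
    "PD": "PENDING",
    "R": "RUNNING",
    "CA": "CANCELLED",
    "CF": "CONFIGURING",
    "CG": "COMPLETING",
    "CD": "COMPLETED",
    "F": "FAILED",
    "TO": "TIMEOUT",
    "NF": "NODE_FAIL",
    "RV": "REVOKED",
    "SE": "SPECIAL_EXIT",
}


def describe(code):
    """Return the extended state name, or 'UNKNOWN' for codes not in the table."""
    return STATE_NAMES.get(code.strip().upper(), "UNKNOWN")


print(describe("CG"))  # prints COMPLETING
```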

khaverim answered Oct 17 '22 12:10