
How to determine if a job is failed

Tags: kubernetes

How can I programmatically determine whether a job has failed for good and will not be retried any more? I've seen the following on failed jobs:

status:
  conditions:
  - lastProbeTime: 2018-04-25T22:38:34Z
    lastTransitionTime: 2018-04-25T22:38:34Z
    message: Job has reach the specified backoff limit
    reason: BackoffLimitExceeded
    status: "True"
    type: Failed

However, the documentation doesn't explain why conditions is a list. Can there be multiple conditions? If so, which one do I rely on? Is it a guarantee that there will only be one with status: "True"?

rcorre asked Apr 27 '18 02:04



1 Answer

JobConditions are similar to PodConditions. You can read about PodConditions in the official docs.
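For the conditions list itself, here is a minimal sketch of one way to read it. It assumes that a condition with type Failed (or Complete) and status "True" marks a job that has finished for good, and it scans the whole list instead of relying on position. The types are local stand-ins that mirror the YAML in the question, not the real client library types, and terminalState is a hypothetical helper name.

```go
package main

// JobCondition and JobStatus are minimal local stand-ins (an assumption for
// this sketch) that mirror the fields shown in the question's YAML.
type JobCondition struct {
	Type   string // e.g. "Complete" or "Failed"
	Status string // "True", "False" or "Unknown"
	Reason string // e.g. "BackoffLimitExceeded"
}

type JobStatus struct {
	Conditions []JobCondition
}

// terminalState (hypothetical helper) looks at every condition and reports
// "failed" or "complete" when a matching condition has status "True".
// It returns empty strings when no such condition is present.
func terminalState(s JobStatus) (state, reason string) {
	for _, c := range s.Conditions {
		if c.Status != "True" {
			continue // ignore conditions that are not currently true
		}
		switch c.Type {
		case "Failed":
			return "failed", c.Reason
		case "Complete":
			return "complete", c.Reason
		}
	}
	return "", ""
}
```

With the status from the question, terminalState would report "failed" together with the reason BackoffLimitExceeded. A job with no matching condition yields empty strings, which a caller can treat as "not finished yet".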

Anyway, to determine whether a job succeeded or failed, I take a different approach. Let's look at it.


There are two relevant fields in the Job spec.

One is spec.completions (default value 1), which is described as:

Specifies the desired number of successfully finished pods the job should be run with.

The other is spec.backoffLimit (default value 6), which is described as:

Specifies the number of retries before marking this job failed.


Now, in JobStatus

JobStatus also has two relevant fields: succeeded and failed. succeeded is the number of pods that completed successfully, and failed is the number of pods that reached phase Failed.

  • Once succeeded is greater than or equal to spec.completions, the job becomes completed.
  • Once failed is greater than or equal to spec.backoffLimit, the job becomes failed.

So the logic looks like this:

// Completions and BackoffLimit are pointers (*int32) in the batch/v1 Go types.
if job.Status.Succeeded >= *job.Spec.Completions {
    return "completed"
} else if job.Status.Failed >= *job.Spec.BackoffLimit {
    return "failed"
}
Abdullah Al Maruf - Tuhin answered Oct 06 '22 00:10