Ansible + Kubernetes: how to wait for a Job completion

Thanks in advance for taking the time to read this.

I'm experimenting with Kubernetes and use Ansible for all interactions with my cluster. I have some playbooks that successfully deploy applications.

The main Ansible module I use for deployment is k8s, which lets me apply my YAML configs.

I can successfully wait until deployment completes using

k8s:
    state: present
    definition: config.yaml
    wait: yes
    wait_timeout: 10

Unfortunately, the same trick doesn't work by default with Kubernetes Jobs. The module simply returns immediately, which is exactly what the Ansible module documentation describes:

For resource kinds without an implementation, wait returns immediately unless wait_condition is set.

To cover such a case, the module spec suggests specifying

wait_condition:
  reason: REASON
  type: TYPE
  status: STATUS

The doc also says:

The possible types for a condition are specific to each resource type in Kubernetes. See the API documentation of the status field for a given resource to see possible choices.

I checked the API specification and found the same thing as stated in the following answer:

the only type values are "Complete" and "Failed", and that they may have a "True" or "False" status

So, my QUESTION is simple: does anyone know how to use this wait_condition properly? Have you tried it already (as of now, it's a relatively new feature)?

Any ideas where to look are also appreciated.

UPDATE:

This is the workaround I use for now:

- name: Run Job
  k8s:
    state: present
    definition: job_definition.yml

# status.active is absent once no Pods are running, so default it to 0.
- name: Wait Until Job Is Done
  k8s_facts:
    name: job_name
    kind: Job
  register: job_status
  until: job_status.resources[0].status.active | default(0) != 1
  retries: 10
  delay: 10
  ignore_errors: yes

- name: Get Final Job Status
  k8s_facts:
    name: job_name
    kind: Job
  register: job_status

# status.failed is absent unless at least one Pod has failed.
- fail:
    msg: "Job has failed!"
  when: job_status.resources[0].status.failed | default(0) == 1

But it would be better to use the proper module feature directly.

Konstantin Dobroliubov asked Aug 09 '19 16:08


2 Answers

(The other answers were so close that I'd edit them, but it says the edit queues are full.) The status in a Job condition is a string. In YAML, an unquoted True is resolved to a boolean, so you need to quote it to get a string, as in the YAML output of the Job:

$ kubectl -n demo get job jobname -o yaml
apiVersion: batch/v1
kind: Job
metadata: ...
spec: ...
status:
  completionTime: "2021-01-19T16:24:47Z"
  conditions:
  - lastProbeTime: "2021-01-19T16:24:47Z"
    lastTransitionTime: "2021-01-19T16:24:47Z"
    status: "True"
    type: Complete
  startTime: "2021-01-19T16:24:46Z"
  succeeded: 1

Therefore, to wait for completion, you need to quote the status in wait_condition:

  k8s:
    wait: yes
    wait_condition:
      type: Complete
      status: "True"

(The wait parameter expects a boolean. Under the YAML 1.2 core schema yes is a string, but Ansible's YAML 1.1 parser reads it as a boolean, and Ansible accepts several other values for boolean parameters as well.)
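
For reference, here is a minimal sketch of a complete task built around that wait_condition. The file name job_definition.yml and the 300-second timeout are placeholders I picked for illustration, and I'm assuming the manifest is loaded with the module's src parameter; adjust both to your setup.

```yaml
# Sketch only: job_definition.yml and the timeout are hypothetical values.
- name: Run Job and wait for it to complete
  k8s:
    state: present
    src: job_definition.yml   # path to the Job manifest (assumed)
    wait: yes
    wait_timeout: 300         # seconds; size this to the Job's expected runtime
    wait_condition:
      type: Complete
      status: "True"          # quoted so YAML keeps it a string
```

As far as I can tell, if the Job fails instead, the Complete condition never appears and the task only ends when wait_timeout expires, so pick the timeout with that in mind.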

Marko Kohtala answered Sep 28 '22 23:09


wait_condition works for me with Jobs, as long as wait_timeout, type, and status are set appropriately; choose the timeout based on how long your Job usually takes:

        wait: yes
        wait_timeout: 300
        wait_condition:
          type: Complete
          status: True
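
Waiting on Complete says nothing about a Job that fails, so it may help to poll for either terminal condition and then fail the play explicitly. The following is only a sketch under a few assumptions: job_name and the demo namespace are placeholders, and it uses k8s_info, which is the name k8s_facts got in Ansible 2.9.

```yaml
# Sketch only: job_name and the demo namespace are hypothetical placeholders.
- name: Poll until the Job reports a Complete or Failed condition
  k8s_info:                   # called k8s_facts before Ansible 2.9
    api_version: batch/v1
    kind: Job
    name: job_name
    namespace: demo
  register: job_status
  # conditions is absent while the Job is still running, hence default([]).
  until: >-
    job_status.resources[0].status.conditions | default([])
    | selectattr('type', 'in', ['Complete', 'Failed'])
    | selectattr('status', 'equalto', 'True')
    | list | length > 0
  retries: 30
  delay: 10

- name: Fail the play if the Job ended with a Failed condition
  fail:
    msg: "Job failed"
  when: >-
    job_status.resources[0].status.conditions | default([])
    | selectattr('type', 'equalto', 'Failed')
    | selectattr('status', 'equalto', 'True')
    | list | length > 0
```

The comparison is against the string 'True' because the condition status comes back from the API as a string, not a boolean.
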
flabatut answered Sep 28 '22 23:09