Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ansible playbook wait until all pods running

I have this ansible (working) playbook that looks at the output of kubectl get pods -o json until the pod is in the Running state. Now I want to extend this to multiple pods. The core issue is that the json result of the kubectl query is a list, I know how to access the first item, but not all of the items...

- name: wait for pods to come up
  shell: kubectl get pods -o json
  register: kubectl_get_pods
  until: kubectl_get_pods.stdout|from_json|json_query('items[0].status.phase') == "Running"
  retries: 20

The json object looks like,

[  { ...  "status": { "phase": "Running" } },
   { ...  "status": { "phase": "Running" } },
   ...
]

Using [0] to access the first item worked for handling one object in the list, but I can't figure out how to extend it to multiple items. I tried [*] which did not work.

like image 848
Martin Guthrie Avatar asked Nov 07 '18 22:11

Martin Guthrie


2 Answers

You can use kubernetes.core.k8s_info from kubernetes.core collection

For example, wait for cert-manager to be up in the cert-manager namespace:

- name: Wait until cert-manager is up
  kubernetes.core.k8s_info:
    kubeconfig: "{{ kubeconfig }}"
    api_version: v1
    kind: Pod
    namespace: cert-manager
  register: pod_list
  until: pod_list|json_query('resources[*].status.phase')|unique == ["Running"]
like image 145
Xavi Martínez Avatar answered Sep 20 '22 19:09

Xavi Martínez


The kubectl wait command

Kubernetes introduced the kubectl wait in v1.11 version:

CHANGELOG-1.11:

  • kubectl wait is a new command that allows waiting for one or more resources to be deleted or to reach a specific condition. It adds a kubectl wait --for=[delete|condition=condition-name] resource/string command.

CHANGELOG-1.13:

  • kubectl wait now supports condition value checks other than true using --for condition=available=false

CHANGELOG-1.14:

  • Expanded kubectl wait to work with more types of selectors.
  • kubectl wait command now supports the --all flag to select all resources in the namespace of the specified resource types.

It is not intended to wait for phases, but for conditions. I think that waiting for conditions is much more assertive than waiting for phases. See the following conditions:

  • PodScheduled: the Pod has been scheduled to a node;
  • Ready: the Pod is able to serve requests and should be added to the load balancing pools of all matching Services;
  • Initialized: all init containers have started successfully;
  • ContainersReady: all containers in the Pod are ready.

Using kubectl wait with Ansible

Suppose that you are automating a Kubernetes install with kubeadm + Ansible, and need to wait for the installation to complete:

- name: Wait for all control-plane pods become created
  shell: "kubectl get po --namespace=kube-system --selector tier=control-plane --output=jsonpath='{.items[*].metadata.name}'"
  register: control_plane_pods_created
  until: item in control_plane_pods_created.stdout
  retries: 10
  delay: 30
  with_items:
    - etcd
    - kube-apiserver
    - kube-controller-manager
    - kube-scheduler

- name: Wait for control-plane pods become ready
  shell: "kubectl wait --namespace=kube-system --for=condition=Ready pods --selector tier=control-plane --timeout=600s"
  register: control_plane_pods_ready

- debug: var=control_plane_pods_ready.stdout_lines

Result Example:

TASK [Wait for all control-plane pods become created] ******************************
FAILED - RETRYING: Wait all control-plane pods become created (10 retries left).
FAILED - RETRYING: Wait all control-plane pods become created (9 retries left).
FAILED - RETRYING: Wait all control-plane pods become created (8 retries left).
changed: [localhost -> localhost] => (item=etcd)
changed: [localhost -> localhost] => (item=kube-apiserver)
changed: [localhost -> localhost] => (item=kube-controller-manager)
changed: [localhost -> localhost] => (item=kube-scheduler)

TASK [Wait for control-plane pods become ready] ********************************
changed: [localhost -> localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "control_plane_pods_ready.stdout_lines": [
        "pod/etcd-localhost.localdomain condition met", 
        "pod/kube-apiserver-localhost.localdomain condition met", 
        "pod/kube-controller-manager-localhost.localdomain condition met", 
        "pod/kube-scheduler-localhost.localdomain condition met"
    ]    
}
like image 33
Eduardo Baitello Avatar answered Sep 20 '22 19:09

Eduardo Baitello