
Get the list of Pods stuck in terminating state for more than 10 mins and remove them in Ansible

I want to use Ansible to list the pods that have been stuck in the Terminating state for more than 10 minutes. I am currently writing a shell script for this, but I feel there must be a better way. In the snippet below, I plan to replace the describe pod command with the delete one.

# Command used to delete :  kubectl delete pod $PodName -n {{item}} --force --grace-period=0;
- name: get list of pods and remove the not ready ones
  shell: |
    noOfPODs=`kubectl get pods -n {{item}} | egrep "0/1|Terminating" | wc -l`;
    if [ $noOfPODs -gt 0 ];
      then
        kubectl get pods -n {{item}} | egrep "0/1|Terminating"   > {{ not_ready_pods_file }} ;
        while read line; do
          PodName=$(echo $line | awk {'print $1'})
          PodTime=$(kubectl describe pod $PodName -n {{item}} | grep Terminating | awk {'print $4'} | tr -d 'mhd)')
          if [ -z $PodTime ];
          then
            PodTime=$(echo $line | awk {'print $5'} | tr -d 'mhd')
          fi
          echo "$PodTime is PodTime"
          if [[ $PodTime == *s ]] ;
          then
            echo "PodTime in seconds"
          else
            if [ $PodTime -gt 10 ];
            then
              echo "\n$PodName" >> {{ deleted_pods_file }};
              kubectl delete pod $PodName -n {{item}} --force --grace-period=0;
            fi
          fi
        done < {{ not_ready_pods_file }}
    else
      echo 'No Pods in NOT READY or Terminating state';
    fi
  environment:
    KUBECONFIG: "./_kubeconfig/{{ env }}/kubeconfig"
  loop:
    - somenamespace
I tried using k8s_info in Ansible, but it returns a huge output that does not seem to include the time:

- name: Search for all running pods
  k8s_info:
    kind: Pod
    field_selectors:
      - status.phase=Running
    kubeconfig: "./_kubeconfig/{{ env }}/kubeconfig"

Is there a better way of doing this, for example with Prometheus? The shell script will work, but it does not seem like the right way.

Asked by codeaprendiz on Jul 14 '20 at 06:07



1 Answer

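You could leverage a go-template for this. Note that Terminating is not a pod phase (the phases are Pending, Running, Succeeded, Failed and Unknown); kubectl shows Terminating when metadata.deletionTimestamp is set, so the template tests that field instead of .status.phase:

kubectl get pods --all-namespaces -o go-template --template '{{range .items}}{{if .metadata.deletionTimestamp}}{{if lt .metadata.deletionTimestamp "2020-07-03T04:18:02Z"}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}{{end}}'

Replace "2020-07-03T04:18:02Z" with your own cutoff, for example the current UTC time minus 10 minutes. The timestamps are compared as strings, which works because they are all UTC in the same RFC 3339 layout.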
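
If computing the cutoff by hand is inconvenient, the same check can be scripted. The following is a minimal Python sketch, not a tested production tool: it filters the JSON document that `kubectl get pods --all-namespaces -o json` prints, and assumes a terminating pod is one whose `metadata.deletionTimestamp` is set. The names `parse_k8s_time` and `stuck_terminating` and the sample pods are hypothetical, made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value):
    """Parse a Kubernetes UTC timestamp such as 2020-07-03T04:18:02Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def stuck_terminating(pod_list, now, threshold=timedelta(minutes=10)):
    """Return (namespace, name) pairs for pods whose deletionTimestamp
    lies more than `threshold` before `now`."""
    stuck = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        deletion = meta.get("deletionTimestamp")
        if deletion is None:
            continue  # the pod is not being deleted
        if now - parse_k8s_time(deletion) > threshold:
            stuck.append((meta.get("namespace"), meta["name"]))
    return stuck


# Hypothetical sample shaped like `kubectl get pods -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "pod-a", "namespace": "somenamespace",
                      "deletionTimestamp": "2020-07-03T04:18:02Z"}},
        {"metadata": {"name": "pod-b", "namespace": "somenamespace",
                      "deletionTimestamp": "2020-07-03T04:25:00Z"}},
        {"metadata": {"name": "pod-c", "namespace": "somenamespace"}},
    ]
}

now = datetime(2020, 7, 3, 4, 30, 0, tzinfo=timezone.utc)
print(stuck_terminating(sample, now))  # [('somenamespace', 'pod-a')]
```

In real use, `pod_list` would come from `json.loads` on the kubectl output and `now` from `datetime.now(timezone.utc)`. As far as I know, deletionTimestamp already includes the pod's grace period, so the threshold counts from the moment the pod should have been gone rather than from the delete request.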

Answered by Ottovsky on Nov 15 '22 at 09:11