
Make Kubernetes wait for Pod termination before removing from Service endpoints

Tags:

kubernetes

According to Termination of Pods, step 7 occurs simultaneously with 3. Is there any way I can prevent this from happening and have 7 occur only after the Pod's graceful termination (or expiration of the grace period)?

The reason why I need this is that my Pod's termination routine requires my-service-X.my-namespace.svc.cluster.local to resolve to the Pod's IP during the whole process, but the corresponding Endpoint gets removed as soon as I run kubectl delete on the Pod / Deployment.

Note: In case it helps to make this clear, I'm running a bunch of clustered VerneMQ (Erlang) nodes which, on termination, dump their contents to other nodes in the cluster. Hence the nodenames need to resolve correctly during the whole termination process. Only then should the corresponding Endpoints be removed.

Tony E. Stark asked Oct 18 '22 23:10


2 Answers

Unfortunately, Kubernetes was designed to remove the Pod from the Service endpoints at the same time as the preStop hook starts (see the link to the Kubernetes docs in the question):

At the same time as the kubelet is starting graceful shutdown, the control plane removes that shutting-down Pod from Endpoints

The Google Kubernetes docs say it even more clearly:

  1. Pod is set to the “Terminating” State and removed from the endpoints list of all Services

There was also a feature request for this, which was not taken up.
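
For reference, here is a minimal sketch of where the preStop hook and the grace period live in a Pod spec. The Pod name, label, grace period value and drain script path are hypothetical placeholders, not taken from the question. Endpoint removal starts in parallel with this hook; nothing in this manifest delays it.

```yaml
# Sketch only: name, label, 120s value and script path are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: vernemq-0              # hypothetical name
  namespace: my-namespace
  labels:
    app: vernemq               # hypothetical label
spec:
  # Upper bound for the whole shutdown, including the preStop hook.
  terminationGracePeriodSeconds: 120
  containers:
    - name: vernemq
      image: vernemq/vernemq
      lifecycle:
        preStop:
          exec:
            # Hypothetical script; runs before the container receives SIGTERM.
            command: ['sh', '-c', '/path/to/drain.sh']
```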

Solution for Helm users

But if you are using Helm, you can use hooks (e.g. pre-delete, pre-upgrade, pre-rollback). Unfortunately, a Helm hook runs as a separate Pod, which cannot access all of the original Pod's resources.

This is an example of a hook:

apiVersion: batch/v1
kind: Job
metadata:
  name: graceful-shutdown-hook
  annotations:
    "helm.sh/hook": pre-delete,pre-upgrade,pre-rollback
  labels:
    app.kubernetes.io/name: graceful-shutdown-hook
spec:
  template:
    spec:
      containers:
        - name: graceful-shutdown
          image: busybox:1.28.2
          command: ['sh', '-cx', '/bin/sleep 15']
      restartPolicy: Never
  backoffLimit: 0

Matthias M answered Nov 15 '22 09:11


Maybe you should consider using a headless Service instead of a ClusterIP one. That way your apps discover each other using the actual endpoint IPs, and removal from the endpoint list will not break availability during shutdown; it will only remove the Pod from discovery (or, e.g., from the ingress controller backends in nginx contrib).
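
A headless Service along those lines might look roughly like the sketch below. The selector label and port are assumptions, and my-service-x stands in for the placeholder name from the question (Service names must be lowercase). The publishNotReadyAddresses field is an optional extra not mentioned in this answer; it asks endpoint consumers to ignore readiness, but whether it keeps a terminating Pod resolvable is worth verifying on your cluster version.

```yaml
# Sketch only: selector, port and publishNotReadyAddresses are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: my-service-x
  namespace: my-namespace
spec:
  clusterIP: None                  # headless: DNS returns the Pod IPs directly
  publishNotReadyAddresses: true   # optional; verify the effect during termination
  selector:
    app: vernemq                   # hypothetical label on the VerneMQ Pods
  ports:
    - name: mqtt
      port: 1883                   # assumed MQTT listener port
```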

Radek 'Goblin' Pieczonka answered Nov 15 '22 08:11