Restarting pods quickly

Tags:

kubernetes

I have been experimenting with Kubernetes recently, and I have been trying to test failover in pods by having a replication controller whose containers crash as soon as they are used (thus causing a restart).

I have adapted the bashttpd project for this: https://github.com/Chronojam/bashttpd

(I have set it up so that it serves the hostname of the container, then exits.)
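
For reference, this is roughly what the replication controller looks like; the image name, port, and labels here are my guesses for illustration, not the exact manifest:

apiVersion: v1
kind: ReplicationController
metadata:
  name: chronojam-serve-once
spec:
  replicas: 6
  selector:
    app: serve-once
  template:
    metadata:
      labels:
        app: serve-once
    spec:
      containers:
      - name: bashttpd
        image: chronojam/bashttpd    # serves the container hostname once, then exits
        ports:
        - containerPort: 80
      restartPolicy: Always          # the default; crashed containers are restarted with backoff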

This works great, except that the restart is far too slow for what I am trying to do: it works for the first couple of requests, then stops for a while, then starts working again once the pods have restarted. (Ideally I'd like to see no interruption at all when accessing the service.)

I think (but am not sure) that the backoff delay mentioned here is to blame: https://github.com/kubernetes/kubernetes/blob/master/docs/user-guide/pod-states.md#restartpolicy

Some output:

#] kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
chronojam-blog-a23ak         1/1       Running   0          6h
chronojam-blog-abhh7         1/1       Running   0          6h
chronojam-serve-once-1cwmb   1/1       Running   7          4h
chronojam-serve-once-46jck   1/1       Running   7          4h
chronojam-serve-once-j8uyc   1/1       Running   3          4h
chronojam-serve-once-r8pi4   1/1       Running   7          4h
chronojam-serve-once-xhbkd   1/1       Running   4          4h
chronojam-serve-once-yb9hc   1/1       Running   7          4h
chronojam-tactics-is1go      1/1       Running   0          5h
chronojam-tactics-tqm8c      1/1       Running   0          5h
#] curl http://serve-once.chronojam.co.uk
<h3> chronojam-serve-once-j8uyc </h3>
#] curl http://serve-once.chronojam.co.uk
<h3> chronojam-serve-once-r8pi4 </h3>
#] curl http://serve-once.chronojam.co.uk
<h3> chronojam-serve-once-yb9hc </h3>
#] curl http://serve-once.chronojam.co.uk
<h3> chronojam-serve-once-46jck </h3>
#] curl http://serve-once.chronojam.co.uk
#] curl http://serve-once.chronojam.co.uk

You'll also note that even though there should still be two healthy pods, the service stops returning anything after the fourth request.

So my question is twofold:

1) Can I tweak the backoff delay?

2) Why does my service not send my requests to the healthy containers?

Observations:

I think that it might be the webserver itself not being able to start serving requests that quickly, so Kubernetes is recognizing those pods as healthy and sending requests there, but getting nothing back because the process hasn't started yet.
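
If that is the case, a readinessProbe on the container would let Kubernetes check whether the process is actually accepting connections before routing traffic to the pod. A minimal sketch of what that might look like (the path, port, and timings are my guesses):

readinessProbe:
  httpGet:
    path: /          # any URL the server answers once it is up
    port: 80
  initialDelaySeconds: 1
  timeoutSeconds: 1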

asked Dec 18 '15 by Chronojam



1 Answer

I filed an issue to document the recommended practice. I put a sketch of the approach in the issue:

https://github.com/kubernetes/kubernetes/issues/20473

  • ensure the pods have a non-zero terminationGracePeriodSeconds set
  • configure a readinessProbe on the main serving container of the pods
  • handle SIGTERM in the application: fail the readinessProbe but continue to handle normal requests and do not exit
  • set maxUnavailable and/or maxSurge in the Deployment API spec (available in 1.2) large enough to ensure enough serving instances (see the sketch after this list)
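
Put together, a Deployment along those lines might look roughly like this (a sketch only; the names, image, replica count, and probe values are illustrative, not from the issue):

apiVersion: extensions/v1beta1    # the Deployment API as of Kubernetes 1.2
kind: Deployment
metadata:
  name: serve-once
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1           # at most one replica out of service at a time
      maxSurge: 1
  template:
    metadata:
      labels:
        app: serve-once
    spec:
      terminationGracePeriodSeconds: 30   # non-zero, so SIGTERM arrives before SIGKILL
      containers:
      - name: bashttpd
        image: chronojam/bashttpd
        ports:
        - containerPort: 80
        readinessProbe:           # failing this removes the pod from the service endpoints
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 1
          timeoutSeconds: 1

With this shape, a pod that fails its readinessProbe is dropped from the service endpoints while it restarts, so traffic should only reach replicas that are actually ready.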

Container restarts, especially when they pull images, are fairly expensive for the system. The Kubelet backs off restarts of crashing containers in order to degrade gracefully and avoid DoSing Docker, the registry, the apiserver, etc.
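
As an illustration of the image-pull cost (this part is not from the issue), pinning an image tag and letting the node reuse its cached copy takes the pull out of the restart path:

containers:
- name: bashttpd
  image: chronojam/bashttpd:v1    # a pinned tag (illustrative)
  imagePullPolicy: IfNotPresent   # reuse the cached image on restart instead of re-pulling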

answered by briangrant