I have been experimenting with Kubernetes recently, and I have been trying to test pod failover by having a replication controller whose containers crash as soon as they are used (thus causing a restart).
I have adapted the bashttpd project for this: https://github.com/Chronojam/bashttpd
(I have set it up so that it serves the container's hostname once, then exits.)
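For reference, the replication controller looks roughly like this; the image name and port are placeholders rather than my exact manifest:

apiVersion: v1
kind: ReplicationController
metadata:
  name: chronojam-serve-once
spec:
  replicas: 6
  selector:
    app: serve-once
  template:
    metadata:
      labels:
        app: serve-once
    spec:
      restartPolicy: Always          # the default; crashed containers are restarted in place
      containers:
      - name: serve-once
        image: chronojam/bashttpd    # placeholder image: serves the container hostname once, then exits
        ports:
        - containerPort: 8080        # placeholder port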
This works great, except the restart is far too slow for what I am trying to do: it works for the first couple of requests, then stops for a while, then starts working again once the pods have restarted. (Ideally I'd like to see no interruption at all when accessing the service.)
I think (but am not sure) that the backoff delay mentioned here is to blame: https://github.com/kubernetes/kubernetes/blob/master/docs/user-guide/pod-states.md#restartpolicy
some output:
#] kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
chronojam-blog-a23ak         1/1       Running   0          6h
chronojam-blog-abhh7         1/1       Running   0          6h
chronojam-serve-once-1cwmb   1/1       Running   7          4h
chronojam-serve-once-46jck   1/1       Running   7          4h
chronojam-serve-once-j8uyc   1/1       Running   3          4h
chronojam-serve-once-r8pi4   1/1       Running   7          4h
chronojam-serve-once-xhbkd   1/1       Running   4          4h
chronojam-serve-once-yb9hc   1/1       Running   7          4h
chronojam-tactics-is1go      1/1       Running   0          5h
chronojam-tactics-tqm8c      1/1       Running   0          5h
#] curl http://serve-once.chronojam.co.uk
<h3> chronojam-serve-once-j8uyc </h3>
#] curl http://serve-once.chronojam.co.uk
<h3> chronojam-serve-once-r8pi4 </h3>
#] curl http://serve-once.chronojam.co.uk
<h3> chronojam-serve-once-yb9hc </h3>
#] curl http://serve-once.chronojam.co.uk
<h3> chronojam-serve-once-46jck </h3>
#] curl http://serve-once.chronojam.co.uk
#] curl http://serve-once.chronojam.co.uk
You'll also note that even though there should still be two healthy pods there, it stops returning anything after the fourth request.
So my question is twofold:
1) Can I tweak the backoff delay?
2) Why does my service not send my requests to the healthy containers?
I think it might be that the webserver itself cannot start serving requests that quickly, so Kubernetes is recognizing those pods as healthy and sending requests to them, but getting nothing back because the process hasn't started yet.
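If that is the case, I'm guessing a readinessProbe on the container would stop the service routing traffic to a pod before the web server is actually listening. A rough sketch of what I mean, slotting into the containers section of the replication controller above (the port and timings are guesses, and for a serve-once container even the TCP check might end up counting as the single request, so this is only to illustrate the mechanism):

      containers:
      - name: serve-once
        image: chronojam/bashttpd      # same placeholder image as above
        ports:
        - containerPort: 8080
        readinessProbe:                # the service only sends traffic while this check passes
          tcpSocket:
            port: 8080
          initialDelaySeconds: 1
          periodSeconds: 2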
You can also restart pods with the rollout restart command: running kubectl rollout restart deployment nginx-deployment restarts the pods one by one without taking the deployment down, and kubectl get pods lets you watch the replacement pods come up.
Kubectl doesn't have a direct way of restarting individual Pods. Pods are meant to stay running until they're replaced as part of your deployment routine. This is usually when you release a new version of your container image.
kubectl describe pod [your-pod-name] will show a Last State, which gives you a high-level indication. To see what happened on the pod before it restarted, use kubectl logs [your-pod-name] --previous. You can pipe this to a file for inspection, e.g. kubectl logs [your-pod-name] --previous > previous.log.
I filed an issue to document the recommended practice. I put a sketch of the approach in the issue:
https://github.com/kubernetes/kubernetes/issues/20473
Container restarts, especially when they pull images, are fairly expensive for the system. The Kubelet backs off restarts of crashing containers in order to degrade gracefully without DoSing Docker, the registry, the apiserver, etc.