I have a pod with some terrible, buggy software in it. One reason Kubernetes is great is that it'll just restart the software when it crashes, which is awesome.
Kubernetes was designed for good software, not terrible software, so it applies an exponential backoff when restarting crashed containers. This means that after repeated crashes I can end up waiting as long as five minutes before my pod is restarted.
Is there any way to cap the Kubernetes backoff strategy? I'd like it to wait no longer than thirty seconds before starting the pod again.
CrashLoopBackOff is a Kubernetes state representing a restart loop that is happening in a Pod: a container in the Pod is started, but crashes and is then restarted, over and over again. Kubernetes will wait an increasing back-off time between restarts to give you a chance to fix the error.
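To make the increasing back-off concrete, here is a minimal sketch of how such a schedule grows. It is an illustration only, not the kubelet's actual code: the function name `backoff_schedule` and its parameters are hypothetical, and the 10-second starting delay with doubling is taken from the Kubernetes documentation's description of the behavior (10s, 20s, 40s, ..., capped at five minutes).

```python
def backoff_schedule(initial_s=10, cap_s=300, restarts=8):
    """Return the wait (in seconds) before each of the first `restarts` restarts.

    Hypothetical helper for illustration; the defaults assume the documented
    kubelet behavior: start at 10s, double each time, cap at 300s (5 minutes).
    """
    delays = []
    delay = initial_s
    for _ in range(restarts):
        delays.append(min(delay, cap_s))  # never wait longer than the cap
        delay *= 2                        # exponential growth between crashes
    return delays


if __name__ == "__main__":
    # Schedule under the assumed defaults.
    print(backoff_schedule())
    # What the schedule would look like with the 30-second cap the question asks for.
    print(backoff_schedule(cap_s=30))
```

Under these assumptions the five-minute ceiling is reached by the sixth restart, which is why a container that crashes repeatedly ends up waiting the full five minutes between attempts.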
If you receive the "Back-Off restarting failed container" output message, then your container probably exited soon after Kubernetes started the container.
Unfortunately, the maximum back-off time for container restarts is not tunable. This is deliberate, to protect node reliability (too many container restarts can overwhelm the node). If you absolutely need to change it in your cluster, you will have to modify the maximum back-off time in the kubelet source code, compile your own kubelet binary, and distribute it to your nodes.