
Prevent back-off in Kubernetes crash loop

Tags:

kubernetes

I have a pod with some terrible, buggy software in it. One reason Kubernetes is great is that it'll just restart the software when it crashes, which is awesome.

Kubernetes was designed for good software, not terrible software, so it applies an exponential back-off when restarting crashed containers. This means that after repeated crashes I have to wait up to five minutes before my pod is restarted.

Is there any way to cap the Kubernetes back-off strategy? I'd like it to wait no longer than thirty seconds before starting the pod again.

Riley Lark asked Apr 25 '16 15:04



1 Answer

Unfortunately, the maximum back-off time for container restarts is not tunable. This is deliberate, to protect node reliability: too many container restarts can overwhelm the node. If you absolutely need to change it in your cluster, you will have to modify the maximum back-off time in the kubelet source code, compile your own kubelet binary, and distribute it to your nodes.
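
To make the effect of the cap concrete, here is a minimal Python sketch of a capped exponential back-off. It is only an illustration, not kubelet code: the 10-second initial delay and the doubling factor are assumptions based on the documented kubelet defaults and are not stated in this thread, and the 30-second cap is the hypothetical value the question asks for, which no kubelet setting provides.

```python
def backoff_schedule(initial=10, factor=2, cap=300, restarts=8):
    """Return the delay (in seconds) applied before each successive restart.

    initial and factor are assumed defaults; cap=300 is the five-minute
    ceiling mentioned in the question.
    """
    delays = []
    delay = initial
    for _ in range(restarts):
        delays.append(min(delay, cap))
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(delay * factor, cap)
    return delays


if __name__ == "__main__":
    # Five-minute ceiling, as described in the question.
    print(backoff_schedule())
    # Hypothetical 30-second ceiling the asker would like instead.
    print(backoff_schedule(cap=30))
```

Under these assumptions the five-minute ceiling is reached after about five crashes, whereas a 30-second cap would be reached after two. Getting that second behaviour in a real cluster would require the custom kubelet build described above.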

Yu-Ju Hong answered Sep 26 '22 01:09