My kubernetes pods keep crashing with "CrashLoopBackOff" but I can't find any log

Tags:

kubernetes

People also ask

How do you get logs of crashed pods in Kubernetes?

Checking the logs of a crashed pod: if a pod has restarted and you want to check the logs of the previous run, use the --previous flag: kubectl logs nginx-7d8b49557c-c2lx9 --previous.

Why is my pod CrashLoopBackOff?

What does CrashLoopBackOff mean? CrashLoopBackOff is a status message that indicates one of your pods is in a constant state of flux: one or more containers are failing and restarting repeatedly. This typically happens because each pod inherits a default restartPolicy of Always upon creation.
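You can watch the loop in action with plain kubectl (the pod name my-pod below is a placeholder): the STATUS column shows CrashLoopBackOff and RESTARTS keeps climbing, and a JSONPath query confirms the restart policy:

kubectl get pod my-pod

kubectl get pod my-pod -o jsonpath='{.spec.restartPolicy}'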

How do you check logs of pods in Kubernetes?

To get pod logs, use kubectl logs; adding the -p flag makes kubectl return the logs stored for the pod's previous run, including lines that were emitted by containers that have since been terminated.
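For example (the pod and container names here are placeholders):

kubectl logs my-pod --previous

# for a multi-container pod, name the container explicitly
kubectl logs my-pod -c my-container --previous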


As @Sukumar commented, your Dockerfile needs to define a command to run (CMD or ENTRYPOINT), or your ReplicationController must specify one.

The pod is crashing because it starts up and then immediately exits, so Kubernetes restarts it and the cycle continues.
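If you have Docker locally and the image pulled, a quick sanity check is to see whether the image defines any startup command at all (the image name is a placeholder):

docker inspect --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' my-image:latest

If both come back empty, the container has nothing to run and will exit immediately.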


To debug, start by describing the pod and pulling the logs of its previous run:

kubectl -n <namespace-name> describe pod <pod name>

kubectl -n <namespace-name> logs -p <pod name>

If you have an application that takes longer to bootstrap, it could be related to the initial values of the readiness/liveness probes. I solved my problem by increasing initialDelaySeconds to 120s, as my Spring Boot application deals with a lot of initialization. The documentation does not mention the default value, which is 0 (https://kubernetes.io/docs/api-reference/v1.9/#probe-v1-core).

service:
  livenessProbe:
    httpGet:
      path: /health/local
      scheme: HTTP
      port: 8888
    initialDelaySeconds: 120
    periodSeconds: 5
    timeoutSeconds: 5
    failureThreshold: 10
  readinessProbe:
    httpGet:
      path: /admin/health
      scheme: HTTP
      port: 8642
    initialDelaySeconds: 150
    periodSeconds: 5
    timeoutSeconds: 5
    failureThreshold: 10

A very good explanation of those values is given in the Stack Overflow question "What is the default value of initialDelaySeconds".

The health or readiness check algorithm works like this:

  1. wait for initialDelaySeconds
  2. perform the check and wait up to timeoutSeconds for a response
  3. if the number of consecutive successes is greater than successThreshold, return success
  4. if the number of consecutive failures is greater than failureThreshold, return failure; otherwise wait periodSeconds and start a new check
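Applied to the liveness values above, a rough worst-case estimate (assuming every check fails) is initialDelaySeconds + failureThreshold × periodSeconds = 120 + 10 × 5 = 170 seconds before the container is considered dead and restarted.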

In my case, my application now has plenty of margin to bootstrap, so I know I will not get a periodic CrashLoopBackOff; before, it was sometimes right on the limit of those thresholds.


I needed to keep a pod running for subsequent kubectl exec calls, and as the comments above pointed out, my pod was getting killed by my k8s cluster because it had completed all its tasks. I managed to keep the pod running by simply starting it with a command that never stops on its own:

kubectl run YOUR_POD_NAME -n YOUR_NAMESPACE --image SOME_PUBLIC_IMAGE:latest --command -- tail -f /dev/null

My pod kept crashing and I was unable to find the cause. Luckily, there is a place where Kubernetes saves all the events that occurred before my pod crashed.

To see these events, run the command:

kubectl get events --sort-by=.metadata.creationTimestamp

Make sure to add a --namespace mynamespace argument to the command if needed.

The events shown in the output of this command showed me why my pod kept crashing.
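If the event list is noisy, you can also filter it down to a single pod (the pod name is a placeholder):

kubectl get events --field-selector involvedObject.name=my-pod --sort-by=.metadata.creationTimestamp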


From this page: the container dies after running everything correctly, but it counts as crashed because all of its commands have ended. Either make your services run in the foreground, or create a keep-alive script. That way, Kubernetes will see that your application is running. Note that this problem is not encountered in a plain Docker environment; it is only Kubernetes that expects a long-running app.

Update (an example):

Here's how to avoid CrashLoopBackOff when launching a Netshoot container:

kubectl run netshoot --image nicolaka/netshoot -- sleep infinity

In your YAML file, add command and args lines:

...
containers:
  - name: api
    image: localhost:5000/image-name
    command: [ "sleep" ]
    args: [ "infinity" ]
...

Works for me.
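Once the pod stays in the Running state, you can open a shell inside it for debugging (the pod name netshoot refers to the example above):

kubectl exec -it netshoot -- sh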