In GKE, I have a pod with two containers. They use the same image, and the only difference is that I am passing them slightly different flags. One runs fine, the other goes into a crash loop. How can I debug the reason for the failure?
My pod definition is:
apiVersion: v1
kind: ReplicationController
metadata:
  name: doorman-client
spec:
  replicas: 10
  selector:
    app: doorman-client
  template:
    metadata:
      name: doorman-client
      labels:
        app: doorman-client
    spec:
      containers:
        - name: doorman-client-proportional
          resources:
            limits:
              cpu: 10m
          image: gcr.io/google.com/doorman/doorman-client:v0.1.1
          command:
            - client
            - -port=80
            - -count=50
            - -initial_capacity=15
            - -min_capacity=5
            - -max_capacity=2000
            - -increase_chance=0.1
            - -decrease_chance=0.05
            - -step=5
            - -resource=proportional
            - -addr=$(DOORMAN_SERVICE_HOST):$(DOORMAN_SERVICE_PORT_GRPC)
            - -vmodule=doorman_client=2
            - --logtostderr
          ports:
            - containerPort: 80
              name: http

        - name: doorman-client-fair
          resources:
            limits:
              cpu: 10m
          image: gcr.io/google.com/doorman/doorman-client:v0.1.1
          command:
            - client
            - -port=80
            - -count=50
            - -initial_capacity=15
            - -min_capacity=5
            - -max_capacity=2000
            - -increase_chance=0.1
            - -decrease_chance=0.05
            - -step=5
            - -resource=fair
            - -addr=$(DOORMAN_SERVICE_HOST):$(DOORMAN_SERVICE_PORT_GRPC)
            - -vmodule=doorman_client=2
            - --logtostderr
          ports:
            - containerPort: 80
              name: http
kubectl describe gives me the following:
6:06 [0] (szopa szopa-macbookpro):~/GOPATH/src/github.com/youtube/doorman$ kubectl describe pod doorman-client-tylba
Name:           doorman-client-tylba
Namespace:      default
Image(s):       gcr.io/google.com/doorman/doorman-client:v0.1.1,gcr.io/google.com/doorman/doorman-client:v0.1.1
Node:           gke-doorman-loadtest-d75f7d0f-node-k9g6/10.240.0.4
Start Time:     Sun, 21 Feb 2016 16:05:42 +0100
Labels:         app=doorman-client
Status:         Running
Reason:
Message:
IP:             10.128.4.182
Replication Controllers:        doorman-client (10/10 replicas created)
Containers:
  doorman-client-proportional:
    Container ID:       docker://0bdcb8269c5d15a4f99ccc0b0ee04bf3e9fd0db9fd23e9c0661e06564e9105f7
    Image:              gcr.io/google.com/doorman/doorman-client:v0.1.1
    Image ID:           docker://a603248608898591c84216dd3172aaa7c335af66a57fe50fd37a42394d5631dc
    QoS Tier:
      cpu:      Guaranteed
    Limits:
      cpu:      10m
    Requests:
      cpu:      10m
    State:      Running
      Started:  Sun, 21 Feb 2016 16:05:42 +0100
    Ready:      True
    Restart Count:      0
    Environment Variables:
  doorman-client-fair:
    Container ID:       docker://92fea92f1307b943d0ea714441417d4186c5ac6a17798650952ea726d18dba68
    Image:              gcr.io/google.com/doorman/doorman-client:v0.1.1
    Image ID:           docker://a603248608898591c84216dd3172aaa7c335af66a57fe50fd37a42394d5631dc
    QoS Tier:
      cpu:      Guaranteed
    Limits:
      cpu:      10m
    Requests:
      cpu:      10m
    State:      Running
      Started:  Sun, 21 Feb 2016 16:06:03 +0100
    Last Termination State:     Terminated
      Reason:           Error
      Exit Code:        0
      Started:          Sun, 21 Feb 2016 16:05:43 +0100
      Finished:         Sun, 21 Feb 2016 16:05:44 +0100
    Ready:              False
    Restart Count:      2
    Environment Variables:
Conditions:
  Type          Status
  Ready         False
Volumes:
  default-token-ihani:
    Type:       Secret (a secret that should populate this volume)
    SecretName: default-token-ihani
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
───────── ──────── ───── ──── ───────────── ────── ───────
29s 29s 1 {scheduler } Scheduled Successfully assigned doorman-client-tylba to gke-doorman-loadtest-d75f7d0f-node-k9g6
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} implicitly required container POD Pulled Container image "gcr.io/google_containers/pause:0.8.0" already present on machine
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} implicitly required container POD Created Created with docker id 5013851c67d9
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} implicitly required container POD Started Started with docker id 5013851c67d9
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-proportional} Created Created with docker id 0bdcb8269c5d
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-proportional} Started Started with docker id 0bdcb8269c5d
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Created Created with docker id ed0928176958
29s 29s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Started Started with docker id ed0928176958
28s 28s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Created Created with docker id 0a73290085b6
28s 28s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Started Started with docker id 0a73290085b6
18s 18s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Backoff Back-off restarting failed docker container
8s 8s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Started Started with docker id 92fea92f1307
29s 8s 4 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Pulled Container image "gcr.io/google.com/doorman/doorman-client:v0.1.1" already present on machine
8s 8s 1 {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6} spec.containers{doorman-client-fair} Created Created with docker id 92fea92f1307
As you can see, the exit code is zero, with the message being "Error", which is not super helpful.
I tried:
changing the order of the definitions (the first one always runs, the second one always fails),
changing the ports used to be different (no effect),
changing the names of the ports to be different (no effect).
You can also add a debugging container to the pod using kubectl debug. If you specify the -i / --interactive argument, kubectl will automatically attach to the console of the ephemeral container. The command below adds a new busybox container and attaches to it.
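For example, a sketch using the pod and container names from this question (note that kubectl debug relies on ephemeral containers, so it needs a reasonably recent cluster version):

kubectl debug -it doorman-client-tylba --image=busybox --target=doorman-client-fair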
All you have to do is run your standard kubectl get pods -n <namespace> command, and you will be able to see whether any of your pods are in CrashLoopBackOff in the STATUS column. Then check the Events section in the output of kubectl describe pod to see whether any of the probes (liveness, readiness, startup) are failing.
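For example, with the namespace and pod name from this question (the pod is in the default namespace, so -n default is optional):

kubectl get pods -n default
kubectl describe pod doorman-client-tylba -n default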
You can see the logs of a particular container by running kubectl logs <pod name> -c <container name>. If you want to access the logs of a crashed instance, add the --previous flag. This approach works well for clusters with a small number of containers and instances.
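For example, assuming the pod and container names from this question:

kubectl logs doorman-client-tylba -c doorman-client-fair --previous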
It's tough to say exactly without knowing more about your app, but the two containers definitely can't use the same port if they're part of the same pod. In Kubernetes, each pod gets its own IP address, and every container in the pod shares that same IP address and network namespace. That's why you can't have more than one of them listening on the same port unless you split them into separate pods.
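For example, a minimal sketch of the change for the second container, assuming you move it to port 8080 (an arbitrary example value):

        - name: doorman-client-fair
          image: gcr.io/google.com/doorman/doorman-client:v0.1.1
          command:
            - client
            - -port=8080            # example value; must differ from the other container's -port=80
            # ... remaining flags unchanged ...
          ports:
            - containerPort: 8080   # must match the -port flag above
              name: http-fair       # port names must also be unique within the pod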
To get more info, I'd recommend using the kubectl logs [pod] [optional container name] command, which can be used to get the stdout/stderr from a container. The -p flag can be used to get the logs from the most recently failed container.
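For example, with the names from this question (using the equivalent -c flag to select the container):

kubectl logs -p doorman-client-tylba -c doorman-client-fair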