When I try to run my Elasticsearch container through a Kubernetes Deployment, the Elasticsearch pod fails after some time, while it runs perfectly fine when run directly as a Docker container via docker-compose or a Dockerfile. This is what I get as a result of kubectl get pods:
NAME                  READY     STATUS    RESTARTS   AGE
es-764bd45bb6-w4ckn   0/1       Error     4          3m
Below is the result of kubectl describe pod:
Name:           es-764bd45bb6-w4ckn
Namespace:      default
Node:           administrator-thinkpad-l480/<node_ip>
Start Time:     Thu, 30 Aug 2018 16:38:08 +0530
Labels:         io.kompose.service=es
                pod-template-hash=3206801662
Annotations:    <none>
Status:         Running
IP:             10.32.0.8
Controlled By:  ReplicaSet/es-764bd45bb6
Containers:
  es:
    Container ID:   docker://9be2f7d6eb5d7793908852423716152b8cefa22ee2bb06fbbe69faee6f6aa3c3
    Image:          docker.elastic.co/elasticsearch/elasticsearch:6.2.4
    Image ID:       docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:9ae20c753f18e27d1dd167b8675ba95de20b1f1ae5999aae5077fa2daf38919e
    Port:           9200/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    78
      Started:      Thu, 30 Aug 2018 16:42:56 +0530
      Finished:     Thu, 30 Aug 2018 16:43:07 +0530
    Ready:          False
    Restart Count:  5
    Environment:
      ELASTICSEARCH_ADVERTISED_HOST_NAME:  es
      ES_JAVA_OPTS:                        -Xms2g -Xmx2g
      ES_HEAP_SIZE:                        2GB
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nhb9z (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-nhb9z:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-nhb9z
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                From                                  Message
  ----     ------     ----               ----                                  -------
  Normal   Scheduled  6m                 default-scheduler                     Successfully assigned default/es-764bd45bb6-w4ckn to administrator-thinkpad-l480
  Normal   Pulled     3m (x5 over 6m)    kubelet, administrator-thinkpad-l480  Container image "docker.elastic.co/elasticsearch/elasticsearch:6.2.4" already present on machine
  Normal   Created    3m (x5 over 6m)    kubelet, administrator-thinkpad-l480  Created container
  Normal   Started    3m (x5 over 6m)    kubelet, administrator-thinkpad-l480  Started container
  Warning  BackOff    1m (x15 over 5m)   kubelet, administrator-thinkpad-l480  Back-off restarting failed container
Here is my elasticsearch-deployment.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: es
  name: es
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: es
    spec:
      containers:
      - env:
        - name: ELASTICSEARCH_ADVERTISED_HOST_NAME
          value: es
        - name: ES_JAVA_OPTS
          value: -Xms2g -Xmx2g
        - name: ES_HEAP_SIZE
          value: 2GB
        image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4
        name: es
        ports:
        - containerPort: 9200
        resources: {}
      restartPolicy: Always
status: {}
When I try to get logs using kubectl logs -f es-764bd45bb6-w4ckn, I get:
Error from server: Get https://<slave node ip>:10250/containerLogs/default/es-764bd45bb6-w4ckn/es?previous=true: dial tcp <slave node ip>:10250: i/o timeout
What could be the reason for this problem, and how can I fix it?
1. Check for "Back-off restarting failed container". Run kubectl describe pod [name]. If you get Liveness probe failed and Back-off restarting failed container messages from the kubelet, as in the events above, it indicates the container is not responding and is in the process of restarting.
When a container runs out of memory (OOM), it is restarted according to the pod's restart policy. The default restart policy will eventually back off on restarting the pod if it restarts many times in a short time span.
If a Pod is scheduled to a node that then fails, the Pod is deleted; likewise, a Pod won't survive an eviction due to a lack of resources or node maintenance.
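As a concrete starting point (a sketch using the pod name from this question; substitute your own pod), the following standard kubectl commands show the restart events and the log output of the previous, crashed container instance:

# Show events, restart count and the last terminated state of the failing pod
kubectl describe pod es-764bd45bb6-w4ckn

# Fetch the logs of the previous (crashed) container instance
kubectl logs es-764bd45bb6-w4ckn --previous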
I had the same problem; there can be a couple of reasons for this issue. In my case a jar file was missing. @Lakshya has already answered this question; I would like to add the steps you can take to troubleshoot it.
If your container is up, you can use the kubectl exec -it command to analyse the container further, as shown below.
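For example, with the pod from this question (assuming the image provides bash, which the official Elasticsearch image does):

# Open an interactive shell inside the running container to inspect it
kubectl exec -it es-764bd45bb6-w4ckn -- /bin/bash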
Hope this helps community members facing this issue in the future.
I found the logs using docker logs for the es container and saw that Elasticsearch was not starting because vm.max_map_count was set to a very low value. I changed vm.max_map_count to the desired value using sysctl -w vm.max_map_count=262144, and the pod started after that.
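If you want this setting applied automatically on whichever node the pod lands on, a common approach is a privileged initContainer in the pod spec. A minimal sketch, assuming the cluster allows privileged containers (the init container name and the busybox image are illustrative):

    spec:
      initContainers:
      - name: increase-vm-max-map-count
        image: busybox
        # Raise the kernel setting on the host before Elasticsearch starts
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true

Note that sysctl -w applied directly on the node does not survive a reboot; to make it permanent there, add vm.max_map_count=262144 to /etc/sysctl.conf on the node.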