When I try to run my Elasticsearch container through a Kubernetes Deployment, the Elasticsearch pod fails after some time, while it runs perfectly fine when run directly as a Docker container via docker-compose or a Dockerfile. This is what I get as a result of kubectl get pods:
NAME                  READY     STATUS    RESTARTS   AGE
es-764bd45bb6-w4ckn   0/1       Error     4          3m
Below is the result of kubectl describe pod:
Name:           es-764bd45bb6-w4ckn
Namespace:      default
Node:           administrator-thinkpad-l480/<node_ip>
Start Time:     Thu, 30 Aug 2018 16:38:08 +0530
Labels:         io.kompose.service=es
                pod-template-hash=3206801662
Annotations:    <none>
Status:         Running
IP:             10.32.0.8
Controlled By:  ReplicaSet/es-764bd45bb6
Containers:
  es:
    Container ID:   docker://9be2f7d6eb5d7793908852423716152b8cefa22ee2bb06fbbe69faee6f6aa3c3
    Image:          docker.elastic.co/elasticsearch/elasticsearch:6.2.4
    Image ID:       docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:9ae20c753f18e27d1dd167b8675ba95de20b1f1ae5999aae5077fa2daf38919e
    Port:           9200/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    78
      Started:      Thu, 30 Aug 2018 16:42:56 +0530
      Finished:     Thu, 30 Aug 2018 16:43:07 +0530
    Ready:          False
    Restart Count:  5
    Environment:
      ELASTICSEARCH_ADVERTISED_HOST_NAME:  es
      ES_JAVA_OPTS:                        -Xms2g -Xmx2g
      ES_HEAP_SIZE:                        2GB
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nhb9z (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-nhb9z:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-nhb9z
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                From                                  Message
  ----     ------     ----               ----                                  -------
  Normal   Scheduled  6m                 default-scheduler                     Successfully assigned default/es-764bd45bb6-w4ckn to administrator-thinkpad-l480
  Normal   Pulled     3m (x5 over 6m)    kubelet, administrator-thinkpad-l480  Container image "docker.elastic.co/elasticsearch/elasticsearch:6.2.4" already present on machine
  Normal   Created    3m (x5 over 6m)    kubelet, administrator-thinkpad-l480  Created container
  Normal   Started    3m (x5 over 6m)    kubelet, administrator-thinkpad-l480  Started container
  Warning  BackOff    1m (x15 over 5m)   kubelet, administrator-thinkpad-l480  Back-off restarting failed container
Here is my elasticsearch-deployment.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: es
  name: es
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: es
    spec:
      containers:
      - env:
        - name: ELASTICSEARCH_ADVERTISED_HOST_NAME
          value: es
        - name: ES_JAVA_OPTS
          value: -Xms2g -Xmx2g
        - name: ES_HEAP_SIZE
          value: 2GB
        image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4
        name: es
        ports:
        - containerPort: 9200
        resources: {}
      restartPolicy: Always
status: {}
When I try to get logs using kubectl logs -f es-764bd45bb6-w4ckn, I get:
Error from server: Get https://<slave node ip>:10250/containerLogs/default/es-764bd45bb6-w4ckn/es?previous=true: dial tcp <slave node ip>:10250: i/o timeout
What could be the reason for this problem, and how can I fix it?
1. Check for "Back-off restarting failed container". Run kubectl describe pod [name]. If you get Liveness probe failed and Back-off restarting failed container messages from the kubelet, as in the events above, it indicates the container is not responding and is in the process of restarting.
When a container runs out of memory (OOM), it is restarted according to the pod's restart policy. The default restart policy will eventually back off on restarting the pod if it restarts many times in a short time span.
If a Pod is scheduled to a node that then fails, the Pod is deleted; likewise, a Pod won't survive an eviction due to a lack of resources or node maintenance.
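As a concrete starting point (a sketch using the pod name from this question; substitute your own pod), the following standard kubectl commands show the restart events and the log output of the previous, crashed container instance:

# Show events, restart count and the last terminated state of the failing pod
kubectl describe pod es-764bd45bb6-w4ckn

# Fetch the logs of the previous (crashed) container instance
kubectl logs es-764bd45bb6-w4ckn --previous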
I had the same problem; there can be a couple of reasons for this issue. In my case a jar file was missing. @Lakshya has already answered this question; I would like to add the steps you can take to troubleshoot it.
If your container is up, you can use the kubectl exec -it command to analyse the container further, as shown below.
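For example, with the pod from this question (assuming the image provides bash, which the official Elasticsearch image does):

# Open an interactive shell inside the running container to inspect it
kubectl exec -it es-764bd45bb6-w4ckn -- /bin/bash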
Hope this helps community members facing this issue in the future.
I found the logs using docker logs for the es container and saw that Elasticsearch was not starting because vm.max_map_count was set to a very low value. I changed vm.max_map_count to the desired value using sysctl -w vm.max_map_count=262144, and the pod started after that.
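If you want this setting applied automatically on whichever node the pod lands on, a common approach is a privileged initContainer in the pod spec. A minimal sketch, assuming the cluster allows privileged containers (the init container name and the busybox image are illustrative):

    spec:
      initContainers:
      - name: increase-vm-max-map-count
        image: busybox
        # Raise the kernel setting on the host before Elasticsearch starts
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true

Note that sysctl -w applied directly on the node does not survive a reboot; to make it permanent there, add vm.max_map_count=262144 to /etc/sysctl.conf on the node.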