Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

kubernetes pods stuck at containercreating

I have a raspberry pi cluster (one master , 3 nodes)

My basic image is : raspbian stretch lite

I already set up a basic kubernetes setup where a master can see all his nodes (kubectl get nodes) and they're all running. I used a weave network plugin for the network communication

When everything is all setup i tried to run a nginx pod (first with some replica's but now just 1 pod) on my cluster as followed kubectl run my-nginx --image=nginx

But somehow the pod get stuck in the status "Container creating" , when i run docker images i can't see the nginx image being pulled. And normally an nginx image is not that large so it had to be pulled already by now (15 minutes). The kubectl describe pods give the error that the pod sandbox failed to create and kubernetes will rec-create it.

I searched everything about this issue and tried the solutions on stackoverflow (reboot to restart cluster, searched describe pods , new network plugin tried it with flannel) but i can't see what the actual problem is. I did the exact same thing in Virtual box (just ubuntu not ARM ) and everything worked.

First i thougt it was a permission issue because i run everything as a normal user , but in vm i did the same thing and nothing changed. Then i checked kubectl get pods --all-namespaces to verify that the pods for the weaver network and kube-dns are running and also nothing wrong over there .

Is this a firewall issue in Raspberry pi ? Is the weave network plugin not compatible (even the kubernetes website says it is) with arm devices ? I 'am guessing there is an api network problem and thats why i can't get my pod runnning on a node

[EDIT] Log files

kubectl describe podName

>     
>     Name:           my-nginx-9d5677d94-g44l6 Namespace:      default Node: kubenode1/10.1.88.22 Start Time:     Tue, 06 Mar 2018 08:24:13
> +0000 Labels:         pod-template-hash=581233850
>                     run=my-nginx Annotations:    <none> Status:         Pending IP: Controlled By:  ReplicaSet/my-nginx-9d5677d94 Containers: 
> my-nginx:
>         Container ID:
>         Image:          nginx
>         Image ID:
>         Port:           80/TCP
>         State:          Waiting
>           Reason:       ContainerCreating
>         Ready:          False
>         Restart Count:  0
>         Environment:    <none>
>         Mounts:
>           /var/run/secrets/kubernetes.io/serviceaccount from default-token-phdv5 (ro) Conditions:   Type           Status  
> Initialized    True   Ready          False   PodScheduled   True
> Volumes:   default-token-phdv5:
>         Type:        Secret (a volume populated by a Secret)
>         SecretName:  default-token-phdv5
>         Optional:    false QoS Class:       BestEffort Node-Selectors:  <none> Tolerations:     node.kubernetes.io/not-ready:NoExecute for
> 300s
>                      node.kubernetes.io/unreachable:NoExecute for 300s Events:   Type     Reason                  Age   From               
> Message   ----     ------                  ----  ----               
>     -------   Normal   Scheduled               5m    default-scheduler   Successfully assigned my-nginx-9d5677d94-g44l6 to kubenode1   Normal  
> SuccessfulMountVolume   5m    kubelet, kubenode1  MountVolume.SetUp
> succeeded for volume "default-token-phdv5"   Warning 
> FailedCreatePodSandBox  1m    kubelet, kubenode1  Failed create pod
> sandbox.   Normal   SandboxChanged          1m    kubelet, kubenode1 
> Pod sandbox changed, it will be killed and re-created.

kubectl logs podName

Error from server (BadRequest): container "my-nginx" in pod "my-nginx-9d5677d94-g44l6" is waiting to start: ContainerCreating

journalctl -u kubelet gives this error

Mar 12 13:42:45 kubeMaster kubelet[16379]: W0312 13:42:45.824314   16379 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Mar 12 13:42:45 kubeMaster kubelet[16379]: E0312 13:42:45.824816   16379 kubelet.go:2104] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

The problem seems to be with my network plugin. In my /etc/systemd/system/kubelet.service.d/10.kubeadm.conf . the flags for the network plugins are present ? environment= kubelet_network_args --cni-bin-dir=/etc/cni/net.d --network-plugin=cni

like image 764
achahbar Avatar asked Mar 05 '18 14:03

achahbar


People also ask

How do I resolve Imagepullbackoff in Kubernetes?

To resolve it, double check the pod specification and ensure that the repository and image are specified correctly. If this still doesn't work, there may be a network issue preventing access to the container registry. Look in the describe pod text file to obtain the hostname of the Kubernetes node.

What is ContainerCreating in Kubernetes?

ContainerCreating , for instance, implies that kube-scheduler has assigned a worker node for the container and has instructed Docker daemon to kick-off the workload. However, note that the network isn't provisioned at this stage — i.e, the workload does not have an IP address.

Why are pods in Kubernetes pending?

If a Pod is stuck in Pending it means that it can not be scheduled onto a node. Generally this is because there are insufficient resources of one type or another that prevent scheduling.

Why Kubernetes pod is not ready?

If a Pod is Running but not Ready it means that the Readiness probe is failing. When the Readiness probe is failing, the Pod isn't attached to the Service, and no traffic is forwarded to that instance.


2 Answers

You can see if it's network related by finding the node trying to pull the image:

kubectl describe pod <name> -n <namespace>

SSH to the node, and run docker pull nginx on it. If it's having issues pulling the image manually, then it might be network related.

like image 136
Ross Peoples Avatar answered Sep 25 '22 16:09

Ross Peoples


Thank you all for responding to my question. I solved my problem now. For anyone who has come to my question in the future the solution was as followed.

I cloned my raspberry pi images because i wanted a basicConfig.img for when i needed to add a new node to my cluster of when one gets down.

Weave network (the plugin i used) got confused because on every node and master the os had the same machine-id. When i deleted the machine id and created a new one (and reboot the nodes) my error got fixed. The commands to do this was

sudo rm /etc/machine-id
sudo rm /var/lib/dbus/machine-id
sudo dbus-uuidgen --ensure=/etc/machine-id

Once again my patience was being tested. Because my kubernetes setup was normal and my raspberry pi os was normal. I founded this with the help of someone in the kubernetes community. This again shows us how important and great our IT community is. To the people of the future who will come to this question. I hope this solution will fix your error and will decrease the amount of time you will be searching after a stupid small thing.

like image 38
achahbar Avatar answered Sep 24 '22 16:09

achahbar