I am new to all things Kubernetes so still have much to learn.
Have created a two-node Kubernetes cluster and both nodes (master and worker) are ready to do work, which is good:
[monkey@k8s-dp1 nginx-test]# kubectl get nodes
NAME      STATUS    ROLES     AGE       VERSION
k8s-dp1   Ready     master    2h        v1.9.1
k8s-dp2   Ready     <none>    2h        v1.9.1
Also, all Kubernetes Pods look okay:
[monkey@k8s-dp1 nginx-test]# kubectl get pods --all-namespaces
NAMESPACE     NAME                              READY     STATUS    RESTARTS   AGE
kube-system   etcd-k8s-dp1                      1/1       Running   0          2h
kube-system   kube-apiserver-k8s-dp1            1/1       Running   0          2h
kube-system   kube-controller-manager-k8s-dp1   1/1       Running   0          2h
kube-system   kube-dns-86cc76f8d-9jh2w          3/3       Running   0          2h
kube-system   kube-proxy-65mtx                  1/1       Running   1          2h
kube-system   kube-proxy-wkkdm                  1/1       Running   0          2h
kube-system   kube-scheduler-k8s-dp1            1/1       Running   0          2h
kube-system   weave-net-6sbbn                   2/2       Running   0          2h
kube-system   weave-net-hdv9b                   2/2       Running   3          2h
However, if I try to create a new deployment in the cluster, the deployment gets created but its pods never reach the Running state, e.g.:
[monkey@k8s-dp1 nginx-test]# kubectl apply -f https://k8s.io/docs/tasks/run-application/deployment.yaml
deployment "nginx-deployment" created
[monkey@k8s-dp1 nginx-test]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                READY     STATUS              RESTARTS   AGE
default       nginx-deployment-569477d6d8-f42pz   0/1       ContainerCreating   0          5s
default       nginx-deployment-569477d6d8-spjqk   0/1       ContainerCreating   0          5s
kube-system   etcd-k8s-dp1                        1/1       Running             0          3h
kube-system   kube-apiserver-k8s-dp1              1/1       Running             0          3h
kube-system   kube-controller-manager-k8s-dp1     1/1       Running             0          3h
kube-system   kube-dns-86cc76f8d-9jh2w            3/3       Running             0          3h
kube-system   kube-proxy-65mtx                    1/1       Running             1          2h
kube-system   kube-proxy-wkkdm                    1/1       Running             0          3h
kube-system   kube-scheduler-k8s-dp1              1/1       Running             0          3h
kube-system   weave-net-6sbbn                     2/2       Running             0          2h
kube-system   weave-net-hdv9b                     2/2       Running             3          2h
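(If it is relevant, the rollout can also be watched directly; the commands below just use the deployment name from my example above, so adjust if yours differs:)

kubectl rollout status deployment/nginx-deployment   # reports progress while the new pods are not yet Ready
kubectl get deployment nginx-deployment              # shows desired vs. available replica counts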
I am not sure how to figure out what the problem is, but if I run kubectl get ev, for example, I can see the following suspect event:
<invalid> <invalid> 1 nginx-deployment-569477d6d8-f42pz.15087c66386edf5d Pod Warning FailedCreatePodSandBox kubelet, k8s-dp2 Failed create pod sandbox.
But I don't know where to go from here. I can also see that the nginx Docker image itself never appears in docker images.
How do I find out more about the problem? Am I missing something fundamental in the Kubernetes setup?
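The only other digging I know to try so far is along these lines (the pod name is one of mine, and the kubelet log command would be run on the worker node k8s-dp2 itself):

kubectl describe pod nginx-deployment-569477d6d8-f42pz     # full pod detail, with an Events section at the bottom
kubectl get events --sort-by=.metadata.creationTimestamp   # all recent cluster events in time order
journalctl -u kubelet -f                                   # on k8s-dp2: kubelet logs usually name the underlying error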
--- NEW INFO ---
For background info in case it helps: the Kubernetes nodes are running on CentOS 7 VMs hosted on Windows 10 Hyper-V.
--- NEW INFO ---
Running kubectl describe pods shows the following Warning:
Warning NetworkNotReady 1m kubelet, k8s-dp2 network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized]
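Given that warning, my (unconfirmed) next checks have been around the Weave/CNI setup on the worker node:

kubectl get pods -n kube-system -o wide            # confirm which weave-net pod is on k8s-dp2
ls -l /etc/cni/net.d/                              # on k8s-dp2: should contain the CNI config that Weave writes
kubectl logs -n kube-system weave-net-hdv9b -c weave   # logs from the weave container (pod name from my cluster)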
--- NEW INFO ---
Switched off the Hyper-V VMs running Kubernetes for the night after my day-job hours were over. On my return to the office this morning, I powered up the Kubernetes VMs again to carry on and, for about 15 minutes, kubectl get pods --all-namespaces was still showing ContainerCreating for those nginx pods, the same as yesterday, but right now the command shows all pods as Running, including the nginx pods. In other words, the problem solved itself after a full reboot of both the master and worker node VMs.
I have since done another full reboot and all pods are again showing as Running, which is good.
1. In vSphere 7.0 U3, after an HA failover or reboot of a TKGS Worker Node, pods will show as stuck in the ContainerCreating state.
2. This condition is specifically seen when the TKGS Guest Cluster has Worker Nodes configured to use /var/lib/containerd ephemeral volumes.
If the output from a specific pod is desired, run kubectl describe pod pod_name --namespace kube-system. The Status field should be "Running"; any other status indicates issues with the environment. In the Conditions section, the Ready field should indicate "True".
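For example, a minimal way to pull out just those fields (the pod name here is only illustrative) is:

kubectl describe pod kube-dns-86cc76f8d-9jh2w --namespace kube-system
kubectl get pod kube-dns-86cc76f8d-9jh2w --namespace kube-system -o jsonpath='{.status.phase}'
kubectl get pod kube-dns-86cc76f8d-9jh2w --namespace kube-system -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'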
The status ImagePullBackOff means that a container could not start because Kubernetes could not pull the container image (for reasons such as an invalid image name, or pulling from a private registry without an imagePullSecret).
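As a rough sketch of the private-registry case (the registry address and credentials below are placeholders, not values from this cluster), the usual pattern is to create the secret and then reference it from the pod spec under imagePullSecrets:

kubectl create secret docker-registry regcred \
    --docker-server=registry.example.com \
    --docker-username=<user> \
    --docker-password=<password> \
    --docker-email=<email>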
If a Pod is stuck in Pending, it means that it cannot be scheduled onto a node. Generally this is because there are insufficient resources of one type or another preventing scheduling.
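To check whether that is the case, one option is to compare what the node offers against what is already requested on it (node name is a placeholder):

kubectl describe node <node-name> | grep -A 6 'Allocatable'            # total CPU/memory the node can offer
kubectl describe node <node-name> | grep -A 6 'Allocated resources'    # requests and limits already placed on the node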
Use kubectl describe pod <name> to see more info.