 

kubelet won't start after kubernetes/manifest update

We are seeing some strange behavior in our K8s cluster.

When we try to deploy a new version of our applications we get:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "<container-id>" network for pod "application-6647b7cbdb-4tp2v": networkPlugin cni failed to set up pod "application-6647b7cbdb-4tp2v_default" network: Get "https://[10.233.0.1]:443/api/v1/namespaces/default": dial tcp 10.233.0.1:443: connect: connection refused

I ran kubectl get cs and found the controller-manager and scheduler in an Unhealthy state.
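For anyone unfamiliar with that check, the unhealthy state typically looks something like this (illustrative output, not copied from our cluster; exact messages and ports may differ):

    NAME                 STATUS      MESSAGE                                                                                     ERROR
    controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
    scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
    etcd-0               Healthy     {"health":"true"}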

As described here, I updated /etc/kubernetes/manifests/kube-scheduler.yaml and /etc/kubernetes/manifests/kube-controller-manager.yaml by commenting out --port=0.
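For anyone following along, the change looks roughly like the sketch below; this is a generic kubeadm-style static pod manifest excerpt, not a copy of our files, and the surrounding flags may differ on your cluster:

    # /etc/kubernetes/manifests/kube-scheduler.yaml (illustrative excerpt)
    spec:
      containers:
      - command:
        - kube-scheduler
        - --kubeconfig=/etc/kubernetes/scheduler.conf
        - --leader-elect=true
        #- --port=0    # commented out so the insecure health port used by "kubectl get cs" is served again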

When I checked systemctl status kubelet, it was running:

Active: active (running) since Mon 2020-10-26 13:18:46 +0530; 1 years 0 months ago

I then restarted the kubelet service, and the controller-manager and scheduler were shown as healthy.
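For completeness, the restart was done roughly like this (standard systemd commands, nothing cluster-specific assumed):

    sudo systemctl daemon-reload
    sudo systemctl restart kubelet
    kubectl get cs    # controller-manager and scheduler reported Healthy at this point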

But now systemctl status kubelet shows the following (immediately after the restart it briefly showed a running state):

Active: activating (auto-restart) (Result: exit-code) since Thu 2021-11-11 10:50:49 +0530; 3s ago
    Docs: https://github.com/GoogleCloudPlatform/kubernetes
 Process: 21234 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET

I tried adding Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false" to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf as described here, but it is still not working properly.
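The drop-in ends up looking roughly like the sketch below (illustrative; the stock 10-kubeadm.conf differs between versions). Worth noting: the --allow-privileged flag was removed from the kubelet around v1.15, so passing it to a v1.18 kubelet can itself prevent it from starting.

    # /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (illustrative excerpt, not our exact file)
    [Service]
    Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
    Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
    # line added per the linked suggestion; note --allow-privileged no longer exists on newer kubelets
    Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false"
    ExecStart=
    ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_SYSTEM_PODS_ARGS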

I also un-commented --port=0 again in the above-mentioned manifests and tried restarting; still the same result.

Edit: This issue was due to the kubelet certificate having expired, and it was fixed by following these steps. If someone faces this issue, make sure the /var/lib/kubelet/pki/kubelet-client-current.pem certificate and key values are base64 encoded when placing them in /etc/kubernetes/kubelet.conf.
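A quick way to confirm the expiry on an affected node is to inspect the certificates directly; these are generic openssl checks, nothing kubespray-specific:

    # expiry of the kubelet client certificate on the node
    sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -enddate
    # expiry of the apiserver certificate (control plane nodes only)
    sudo openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -enddate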

Many others suggested running kubeadm init again, but this cluster was created using Kubespray; no nodes were added manually.

We have bare-metal Kubernetes running on Ubuntu 18.04. Kubernetes version: v1.18.8

We would welcome any debugging and fixing suggestions.

PS:
When we try telnet 10.233.0.1 443 from any node, the first attempt fails and the second attempt succeeds.
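10.233.0.1 is the ClusterIP of the default kubernetes Service, so one sanity check is to confirm the Service and its endpoints actually point at a reachable apiserver (default names shown, nothing custom assumed):

    kubectl get svc kubernetes -n default         # ClusterIP should be 10.233.0.1
    kubectl get endpoints kubernetes -n default   # should list the apiserver address(es), e.g. <control-plane-ip>:6443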

Edit: Found this in the kubelet service logs:

Nov 10 17:35:05 node1 kubelet[1951]: W1110 17:35:05.380982    1951 docker_sandbox.go:402] failed to read pod IP from plugin/docker: networkPlugin cni failed on the status hook for pod "app-7b54557dd4-bzjd9_default": unexpected command output nsenter: cannot open /proc/12311/ns/net: No such file or directory
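For reference, these entries can be followed on the node with standard journalctl commands:

    # follow the kubelet logs live
    sudo journalctl -u kubelet -f
    # or show just the most recent entries
    sudo journalctl -u kubelet --no-pager -n 100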
Asked Nov 11 '21 by Sachith Muhandiram


People also ask

How can I check my Kubelet status?

Use kubectl describe pods to check kube-system. If the output from a specific pod is desired, run kubectl describe pod pod_name --namespace kube-system. The Status field should be "Running"; any other status indicates issues with the environment.
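For example (the pod name below is illustrative):

    kubectl get pods --namespace kube-system
    kubectl describe pod kube-scheduler-node1 --namespace kube-system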

What is difference between kubectl and Kubelet?

kubelet: the component that runs on all of the machines in your cluster and does things like starting Pods and containers. kubectl: the command line utility to talk to your cluster.

Does Kubelet run as a container?

One of the kubelet's jobs is to start and stop containers, and the CRI is the interface that the kubelet uses to interact with container runtimes. For example, containerd is categorized as a container runtime because it takes an image and creates a running container.

How to check if kubelet is working or not?

1) Check the status of your docker service. 2) If it is stopped, start it with sudo systemctl start docker. 3) If it is not installed, install it: yum install -y kubelet kubeadm kubectl docker. 4) Now try kubeadm init, and after that check systemctl status kubelet; it should be running.

How to create a Kubernetes kubelet from a failed node?

From a working control plane node in the cluster that has /etc/kubernetes/pki/ca.key, execute kubeadm kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf. $NODE must be set to the name of the existing failed node in the cluster. Then modify the resulting kubelet.conf.
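As a sketch, the invocation looks roughly like this; on older releases such as v1.18 the same subcommand lives under kubeadm alpha:

    # run on a control plane node that has /etc/kubernetes/pki/ca.key
    NODE=node1   # name of the failed node (illustrative)
    kubeadm kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf
    # on kubeadm v1.18:
    # kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf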

What to do if bootstrap-kubelet is not working?

A missing bootstrap-kubelet.conf should not be an issue as long as there is a kubelet.conf; see the kubelet flags documentation: "Path to a kubeconfig file that will be used to get client certificate for kubelet."

How to migrate off the dynamic kubelet configuration feature?

If you are using kubeadm, refer to Configuring each kubelet in your cluster using kubeadm. To migrate off the Dynamic Kubelet Configuration feature, an alternative mechanism should be used to distribute kubelet configuration files. To apply the configuration, the config file must be updated and the kubelet restarted.


1 Answer

Posting the comment as a community wiki answer for better visibility:


This issue was due to the kubelet certificate having expired and was fixed by following these steps. If someone faces this issue, make sure the /var/lib/kubelet/pki/kubelet-client-current.pem certificate and key values are base64 encoded when placing them in /etc/kubernetes/kubelet.conf.
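As a sketch of that last step, assuming the combined PEM file contains both the certificate and the key blocks, the values can be extracted and base64-encoded like this; verify the output before pasting it into client-certificate-data / client-key-data in /etc/kubernetes/kubelet.conf:

    # certificate block, single-line base64 -> client-certificate-data
    sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem | base64 -w0; echo
    # private key block, single-line base64 -> client-key-data
    sudo openssl pkey -in /var/lib/kubelet/pki/kubelet-client-current.pem | base64 -w0; echo
    # then restart the kubelet
    sudo systemctl restart kubelet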

Answered Oct 19 '22 by Bazhikov