
kubelet.service: Main process exited, code=exited, status=255/n/a

I am setting up a test cluster following these instructions: https://kubernetes.io/docs/getting-started-guides/fedora/fedora_manual_config/ and https://kubernetes.io/docs/getting-started-guides/fedora/flannel_multi_node_cluster/. Unfortunately, when I check my nodes, I get the following:

kubectl get no
NAME                        STATUS     ROLES     AGE       VERSION
pccshost2.lan.proficom.de   NotReady   <none>    19h       v1.10.3
pccshost3.lan.proficom.de   NotReady   <none>    19h       v1.10.3

As far as I can tell, the problem is connected to the kubelet.service not running on the master node:

systemctl status kubelet.service

kubelet.service - Kubernetes Kubelet Server
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2019-03-06 10:38:30 CET; 32min ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
  Process: 14057 ExecStart=/usr/bin/kubelet $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBELET_API_SERVER $KUBELET_ADDRESS $KUBELET_PORT $KUBELET_HOSTNAME $KUBE_ALLOW_PRIV $KU>
 Main PID: 14057 (code=exited, status=255)
      CPU: 271ms

Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Failed with result 'exit-code'.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Consumed 271ms CPU time
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Service RestartSec=100ms expired, scheduling restart.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 5.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: Stopped Kubernetes Kubelet Server.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Consumed 271ms CPU time
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Start request repeated too quickly.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: kubelet.service: Failed with result 'exit-code'.
Mar 06 10:38:30 pccshost1.lan.proficom.de systemd[1]: Failed to start Kubernetes Kubelet Server.

kubectl describe node

 Normal  Starting                 9s    kubelet, pccshost2.lan.proficom.de  Starting kubelet.
  Normal  NodeHasSufficientDisk    9s    kubelet, pccshost2.lan.proficom.de  Node pccshost2.lan.proficom.de status is now: NodeHasSufficientDisk
  Normal  NodeHasSufficientMemory  9s    kubelet, pccshost2.lan.proficom.de  Node pccshost2.lan.proficom.de status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    9s    kubelet, pccshost2.lan.proficom.de  Node pccshost2.lan.proficom.de status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     9s    kubelet, pccshost2.lan.proficom.de  Node pccshost2.lan.proficom.de status is now: NodeHasSufficientPID

Can somebody explain what is happening here and how I can fix it? Thanks.

Roger asked Mar 06 '19


4 Answers

When you install a Kubernetes cluster using kubeadm and install the kubelet on the master node (Ubuntu), it creates the file "10-kubeadm.conf" at /etc/systemd/system/kubelet.service.d.

### kubelet drop-in contents
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

The variable $KUBELET_KUBECONFIG_ARGS points to /etc/kubernetes/kubelet.conf, which contains the client certificate signed by the cluster CA. You need to verify the validity of this certificate: if it has expired, create a new certificate with openssl and sign it with your CA.

Steps to verify the certificate

  1. Copy the value of client-certificate-data from /etc/kubernetes/kubelet.conf.
  2. Decode the certificate: echo -n "copied_certificate_value" | base64 --decode
  3. Save the output in a file (vi kubelet.crt).
  4. Verify the validity: openssl x509 -in kubelet.crt -text -noout
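The four steps above can be condensed into a single pipeline. A sketch, assuming the kubeadm default path /etc/kubernetes/kubelet.conf and GNU base64/openssl:

```shell
# check_kubelet_cert: extract the client certificate embedded in a
# kubeconfig and print its expiry date (the notAfter field).
check_kubelet_cert() {
  local conf="${1:-/etc/kubernetes/kubelet.conf}"
  grep 'client-certificate-data' "$conf" \
    | awk '{print $2}' \
    | base64 --decode \
    | openssl x509 -noout -enddate
}

# Usage:
#   check_kubelet_cert                 # checks /etc/kubernetes/kubelet.conf
#   check_kubelet_cert some/other.conf
```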

If the validity has expired, create a new certificate.

Note: it is always safer to take a backup before making any changes: cp -a /etc/kubernetes/ /root/

Steps to generate a new certificate

openssl genrsa -out kubelet.key 2048
openssl req -new -key kubelet.key -subj "/CN=kubelet" -out kubelet.csr
openssl x509 -req -in kubelet.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -out kubelet.crt -days 300

Encode the certificate files (use -w0 so the output is a single line, as the kubeconfig expects):

base64 -w0 kubelet.crt
base64 -w0 kubelet.key
Copy the encoded content into the client-certificate-data and client-key-data fields of /etc/kubernetes/kubelet.conf.
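If you prefer not to paste the long base64 strings by hand, the replacement can be scripted. A sketch, assuming the kubeadm layout where each field and its value sit on one line, and that kubelet.crt and kubelet.key are in the current directory:

```shell
# update_kubelet_conf: replace the client cert/key data fields in a
# kubeconfig with freshly encoded versions of kubelet.crt / kubelet.key.
update_kubelet_conf() {
  local conf="${1:-/etc/kubernetes/kubelet.conf}"
  local crt key
  crt=$(base64 -w0 kubelet.crt)   # -w0: single line, as the YAML expects
  key=$(base64 -w0 kubelet.key)
  # '|' as the sed delimiter avoids clashes with '/' in base64 output
  sed -i "s|\(client-certificate-data:\).*|\1 ${crt}|" "$conf"
  sed -i "s|\(client-key-data:\).*|\1 ${key}|" "$conf"
}
```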

Now, check the status of kubelet on master node

systemctl restart kubelet
systemctl status kubelet
Israrul Haque answered Oct 22 '22


I had the same issue: the kubelet service would not start on the master node. Running the commands below fixed the problem:

$ sudo swapoff -a

$ sudo systemctl restart kubelet.service

$ systemctl status kubelet
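Note that swapoff -a only lasts until the next reboot. For the kubelet to keep starting afterwards, the swap entry in /etc/fstab should be disabled as well. A sketch (the fstab path is the usual default; check your distribution's layout):

```shell
# comment_out_swap: comment out any active swap entries in fstab so
# swap stays disabled across reboots.
comment_out_swap() {
  local fstab="${1:-/etc/fstab}"
  # prefix uncommented lines whose filesystem type is "swap" with '#'
  sed -i '/\sswap\s/ s/^[^#]/#&/' "$fstab"
}

# Afterwards, for the current boot:
#   sudo swapoff -a && sudo systemctl restart kubelet
```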

shreyas.k answered Oct 22 '22


I ran into the same issue, and found a solution here.

Essentially, I had to run the following commands:

swapoff -a
kubeadm reset
kubeadm init
systemctl status kubelet

Then I simply followed the on-screen instructions. My setup used Weave Net for the pod network, so I also had to run kubectl apply -f weave-net.yaml.

AdHorger answered Oct 22 '22


I solved the kubelet problem by adding --fail-swap-on=false to KUBELET_ARGS= in the kubelet config file. But the problem with the nodes stays the same: status NotReady.
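For reference, with the Fedora guide's layout the flag goes into the kubelet sysconfig file; a sketch (the exact path and any other flags already present depend on your setup):

```shell
# /etc/kubernetes/kubelet (path assumed from the Fedora manual-config guide)
KUBELET_ARGS="--fail-swap-on=false"
```

Note that --fail-swap-on=false only stops the kubelet from refusing to start while swap is enabled; a lingering NotReady status usually points to a separate problem (for example the pod network), so checking the kubelet logs on the affected nodes is the next step.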

Roger answered Oct 22 '22