Kubeadm: why does my node not show up though kubelet says it joined?

I am setting up a Kubernetes deployment using auto-scaling groups and Terraform. The kube master node is behind an ELB for some resilience in case something goes wrong. The ELB health check is set to TCP 6443, and it has TCP listeners for 8080, 6443, and 9898. All of the instances and the load balancer belong to a security group that allows all traffic between members of the group, plus public traffic from the NAT Gateway address. I created my AMI using the following script (from the getting started guide)...

# curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
# cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
# apt-get update
# # Install docker if you don't have it already.
# apt-get install -y docker.io
# apt-get install -y kubelet kubeadm kubectl kubernetes-cni

I use the following user data scripts...

kube master

#!/bin/bash
rm -rf /etc/kubernetes/*
rm -rf /var/lib/kubelet/*

kubeadm init \
  --external-etcd-endpoints=http://${etcd_elb}:2379 \
  --token=${token} \
  --use-kubernetes-version=${k8s_version} \
  --api-external-dns-names=kmaster.${master_elb_dns} \
  --cloud-provider=aws
until kubectl cluster-info
do
  sleep 1
done
kubectl apply -f https://git.io/weave-kube

kube node

#!/bin/bash
rm -rf /etc/kubernetes/*
rm -rf /var/lib/kubelet/*

until kubeadm join --token=${token} kmaster.${master_elb_dns}
do
  sleep 1
done

Everything seems to work properly. The master comes up and responds to kubectl commands, with pods for discovery, dns, weave, controller-manager, api-server, and scheduler. kubeadm has the following output on the node...

Running pre-flight checks
<util/tokens> validating provided token
<node/discovery> created cluster info discovery client, requesting info from "http://kmaster.jenkins.learnvest.net:9898/cluster-info/v1/?token-id=eb31c0"
<node/discovery> failed to request cluster info, will try again: [Get http://kmaster.jenkins.learnvest.net:9898/cluster-info/v1/?token-id=eb31c0: EOF]
<node/discovery> cluster info object received, verifying signature using given token
<node/discovery> cluster info signature and contents are valid, will use API endpoints [https://10.253.129.106:6443]
<node/bootstrap> trying to connect to endpoint https://10.253.129.106:6443
<node/bootstrap> detected server version v1.4.4
<node/bootstrap> successfully established connection with endpoint https://10.253.129.106:6443
<node/csr> created API client to obtain unique certificate for this node, generating keys and certificate signing request
<node/csr> received signed certificate from the API server:
Issuer: CN=kubernetes | Subject: CN=system:node:ip-10-253-130-44 | CA: false
Not before: 2016-10-27 18:46:00 +0000 UTC Not After: 2017-10-27 18:46:00 +0000 UTC
<node/csr> generating kubelet configuration
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"

Node join complete:
* Certificate signing request sent to master and response
  received.
* Kubelet informed of new secure connection details.

Run 'kubectl get nodes' on the master to see this machine join.

Unfortunately, running kubectl get nodes on the master returns only the master itself. The only interesting thing I see in /var/log/syslog is

Oct 27 21:19:28 ip-10-252-39-25 kubelet[19972]: E1027 21:19:28.198736   19972 eviction_manager.go:162] eviction manager: unexpected err: failed GetNode: node 'ip-10-253-130-44' not found
Oct 27 21:19:31 ip-10-252-39-25 kubelet[19972]: E1027 21:19:31.778521   19972 kubelet_node_status.go:301] Error updating node status, will retry: error getting node "ip-10-253-130-44": nodes "ip-10-253-130-44" not found

I am really not sure where to look...

Paul Becotte asked Oct 27 '16 21:10



1 Answer

The hostnames of the two machines (the master and the node) must be different. You can check them by running cat /etc/hostname on each machine. If they are the same, edit that file on one of them and run sudo reboot to apply the change. Otherwise kubeadm cannot tell the two machines apart, and they show up as a single node in kubectl get nodes.
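The hostname comparison described above could be scripted so it runs before kubeadm join in the node's user data. This is only a rough sketch: the helper names same_hostname and check_join_hostname are hypothetical (they are not part of kubeadm or kubectl), and the case-insensitive comparison is an assumption based on DNS names being case-insensitive.

```shell
#!/bin/bash
# Sketch only: these helper names are hypothetical, not part of kubeadm.

# same_hostname A B
# Succeeds when the two hostnames match, ignoring case
# (assumption: names are compared the way DNS names are).
same_hostname() {
  local a b
  a="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  b="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
  [ "$a" = "$b" ]
}

# check_join_hostname MASTER_HOSTNAME [NODE_HOSTNAME]
# Prints a verdict and returns non-zero when the names clash.
# NODE_HOSTNAME defaults to the hostname of the machine running the script.
check_join_hostname() {
  local master="$1"
  local node="${2:-$(hostname)}"
  if same_hostname "$master" "$node"; then
    echo "clash: node hostname '$node' matches the master"
    return 1
  fi
  echo "ok: node hostname '$node' differs from master '$master'"
}
```

To use it, run hostname on the master, then on the node call check_join_hostname with that value before kubeadm join. A non-zero return means the node should be renamed first, as described above.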

hitman__o_o answered Oct 31 '22 08:10