I am setting up a Kubernetes deployment using auto-scaling groups and Terraform. The master node sits behind an ELB for some resilience in case something goes wrong. The ELB has its health check set to TCP 6443 and has TCP listeners for 8080, 6443, and 9898. All of the instances and the load balancer belong to a security group that allows all traffic between members of the group, plus public traffic from the NAT gateway address. I created my AMI using the following script (from the getting started guide)...
# curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
# cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
# apt-get update
# # Install docker if you don't have it already.
# apt-get install -y docker.io
# apt-get install -y kubelet kubeadm kubectl kubernetes-cni
I use the following user-data scripts (the first runs on the master, the second on the workers)...
#!/bin/bash
rm -rf /etc/kubernetes/*
rm -rf /var/lib/kubelet/*
kubeadm init \
--external-etcd-endpoints=http://${etcd_elb}:2379 \
--token=${token} \
--use-kubernetes-version=${k8s_version} \
--api-external-dns-names=kmaster.${master_elb_dns} \
--cloud-provider=aws
until kubectl cluster-info
do
sleep 1
done
kubectl apply -f https://git.io/weave-kube
#!/bin/bash
rm -rf /etc/kubernetes/*
rm -rf /var/lib/kubelet/*
until kubeadm join --token=${token} kmaster.${master_elb_dns}
do
sleep 1
done
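Once both scripts have run, a quick sanity check on each side looks something like this (these commands are generic, not specific to this setup):

kubectl get nodes -o wide                       # on the master: the worker should appear shortly after joining
systemctl status kubelet                        # on the worker: the kubelet service should be active
journalctl -u kubelet --no-pager | tail -n 50   # on the worker: registration errors show up here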
Everything seems to work properly. The master comes up and responds to kubectl commands, with pods for discovery, dns, weave, controller-manager, api-server, and scheduler. kubeadm produces the following output on the node...
Running pre-flight checks
<util/tokens> validating provided token
<node/discovery> created cluster info discovery client, requesting info from "http://kmaster.jenkins.learnvest.net:9898/cluster-info/v1/?token-id=eb31c0"
<node/discovery> failed to request cluster info, will try again: [Get http://kmaster.jenkins.learnvest.net:9898/cluster-info/v1/?token-id=eb31c0: EOF]
<node/discovery> cluster info object received, verifying signature using given token
<node/discovery> cluster info signature and contents are valid, will use API endpoints [https://10.253.129.106:6443]
<node/bootstrap> trying to connect to endpoint https://10.253.129.106:6443
<node/bootstrap> detected server version v1.4.4
<node/bootstrap> successfully established connection with endpoint https://10.253.129.106:6443
<node/csr> created API client to obtain unique certificate for this node, generating keys and certificate signing request
<node/csr> received signed certificate from the API server:
Issuer: CN=kubernetes | Subject: CN=system:node:ip-10-253-130-44 | CA: false
Not before: 2016-10-27 18:46:00 +0000 UTC Not After: 2017-10-27 18:46:00 +0000 UTC
<node/csr> generating kubelet configuration
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
Node join complete:
* Certificate signing request sent to master and response received.
* Kubelet informed of new secure connection details.
Run 'kubectl get nodes' on the master to see this machine join.
Unfortunately, running kubectl get nodes on the master only returns itself as a node. The only interesting thing I see in /var/log/syslog is
Oct 27 21:19:28 ip-10-252-39-25 kubelet[19972]: E1027 21:19:28.198736 19972 eviction_manager.go:162] eviction manager: unexpected err: failed GetNode: node 'ip-10-253-130-44' not found
Oct 27 21:19:31 ip-10-252-39-25 kubelet[19972]: E1027 21:19:31.778521 19972 kubelet_node_status.go:301] Error updating node status, will retry: error getting node "ip-10-253-130-44": nodes "ip-10-253-130-44" not found
I am really not sure where to look...
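For anyone debugging the same symptom: the error above means the kubelet is reporting status under a node name the API server has never seen registered. A quick way to compare the two views, with nothing here specific to this setup:

hostname                    # on the worker: typically the name the kubelet registers under
kubectl get nodes -o name   # on the master: the names that actually exist as Node objects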
In v1.18, kubeadm added a check that prevents joining a Node to the cluster if a Node with the same name already exists. This required adding RBAC for the bootstrap-token user to be able to GET a Node object. However, this causes an issue where kubeadm join from v1.18 cannot join a cluster created by kubeadm v1.17.
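If that version skew might be in play, a quick check (assuming kubeadm is on the PATH on both machines) is to compare what each side reports:

kubeadm version -o short   # run on both the control plane node and the joining node
kubectl version --short    # also compares the client and server versions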
The lifecycle of the kubeadm CLI tool is decoupled from the kubelet, which is a daemon that runs on each node within the Kubernetes cluster. The kubeadm CLI tool is executed by the user when Kubernetes is initialized or upgraded, whereas the kubelet is always running in the background.
From a working control plane node in the cluster that has /etc/kubernetes/pki/ca.key, execute kubeadm kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf. $NODE must be set to the name of the existing failed node in the cluster. Modify the resulting kubelet.conf as needed.
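A rough sketch of those steps; the node name below is only an illustration, and the final copy/restart is an assumption based on the usual kubeadm recovery flow rather than something stated above:

NODE=ip-10-253-130-44   # name of the existing failed node; substitute your own
kubeadm kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf
# Copy the adjusted kubelet.conf to /etc/kubernetes/kubelet.conf on the failed node,
# then restart the kubelet there:
systemctl restart kubelet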
When you call kubeadm init, the kubelet configuration is marshalled to disk at /var/lib/kubelet/config.yaml, and also uploaded to a ConfigMap in the cluster. The ConfigMap is named kubelet-config-1.X, where X is the minor version of the Kubernetes version you are initializing.
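Both copies can be inspected directly; the 1.18 suffix below is illustrative, so use whatever minor version the cluster was initialized with:

kubectl -n kube-system get configmap kubelet-config-1.18 -o yaml   # the in-cluster copy
cat /var/lib/kubelet/config.yaml                                   # the on-disk copy on each node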
The hostnames of the two machines (the master and the node) should be different. You can check them by running cat /etc/hostname. If they happen to be the same, edit that file to make them different and then do a sudo reboot to apply the change. Otherwise kubeadm will not be able to differentiate between the two machines, and they will show up as a single node in kubectl get nodes.
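As a concrete sketch (the new name below is just an example):

cat /etc/hostname                              # check the current name on each machine
sudo hostnamectl set-hostname k8s-worker-1     # or edit /etc/hostname directly
sudo reboot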