While following the Kubernetes article on Using kubeadm to Create a Cluster, I got stuck: the AddOn pods I was trying to install (Nginx, Tiller, Grafana, InfluxDB, Dashboard) always stayed in the Pending state.
Running kubectl describe pod tiller-deploy-df4fdf55d-jwtcz --namespace=kube-system
showed the following event:
Type     Reason            Age                From               Message
----     ------            ----               ----               -------
Warning  FailedScheduling  51s (x15 over 3m)  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
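One way to confirm which taint the scheduler is complaining about is to list the taints on each node next to the pods that are still Pending. The following is only a minimal sketch: the helper names show_node_taints and show_pending_pods and the KUBECTL override variable are made up for illustration and are not part of kubectl.

```shell
# KUBECTL is an assumed override hook (for example KUBECTL=echo to preview
# the command line); by default the real kubectl binary is used.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: print each node's name next to its taint keys.
show_node_taints() {
  "$KUBECTL" get nodes \
    -o 'custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Hypothetical helper: list pods in every namespace that are still Pending.
show_pending_pods() {
  "$KUBECTL" get pods --all-namespaces --field-selector=status.phase=Pending
}
```

On a single-node kubeadm cluster, calling show_node_taints would be expected to show the node-role.kubernetes.io/master key on the master, which is the taint the FailedScheduling event refers to.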
When I ran the command from the Master Isolation section, kubectl taint nodes --all node-role.kubernetes.io/master-, the AddOns would install as expected.
Because the pods are now running on the master node, I can only suspect that the reason was that I hadn't connected a worker node to the cluster yet for the scheduler to schedule the pods on.
The documentation states "your cluster will not schedule pods on the master for security reasons". I know that this is a non-production environment, so there is little risk in this situation, but what is the risk of removing that taint in a production cluster?
Follow-up: If this is a risk, how can I re-add that taint so I can then uninstall the AddOn pods and try to have the scheduler install them on my Worker Node?
Environment Details:
Operating System - CentOS 7.4.1708 (Core)
Kubernetes Version - 1.10
My pod stays pending: If a Pod is stuck in Pending, it means that it cannot be scheduled onto a node. Generally this is because there are insufficient resources of one type or another that prevent scheduling.
kube-scheduler: This component is responsible for scheduling pods on specific nodes according to automated workflows and user-defined conditions, which can include resource requests, concerns like affinity and taints or tolerations, priority, persistent volumes (PV), and more.
the reason was that I hadn't connected a worker node to the cluster yet for the scheduler to schedule the pods on.
100% correct. You will for sure want some worker nodes, otherwise the idea of "scheduling work" becomes very weird.
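To check whether a worker has actually joined, and to get a join line for one that hasn't, something like the sketch below may help. The helper names list_nodes and print_join_command and the KUBECTL/KUBEADM override variables are hypothetical; the join command has to be run on the master, and it is assumed that your kubeadm version supports the --print-join-command flag.

```shell
# KUBECTL and KUBEADM are assumed override hooks (for example KUBECTL=echo
# to preview the command); by default the real binaries are used.
KUBECTL="${KUBECTL:-kubectl}"
KUBEADM="${KUBEADM:-kubeadm}"

# Hypothetical helper: list all nodes so you can see whether a worker exists.
list_nodes() {
  "$KUBECTL" get nodes -o wide
}

# Hypothetical helper, run on the master: create a new bootstrap token and
# print a "kubeadm join ..." line to run on the worker.
print_join_command() {
  "$KUBEADM" token create --print-join-command
}
```

If list_nodes shows only the master, the scheduler has nowhere untainted to place the AddOn pods, which matches the "0/1 nodes are available" event in the question.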
but what is the risk of removing that taint in a production cluster?
I am not a Kubernetes security expert, but a pragmatic risk is CPU, I/O, and/or memory exhaustion on the master nodes, which would have very severe consequences for the health of the cluster. There is almost never a reason to run any workload on a master node, and doing so almost always increases risk, so the advice "just don't do it" is well founded.
how can I re-add that taint so I can then uninstall the AddOn pods and try to have the scheduler install them on my Worker Node?
I'm not sure I follow that question, but I would for sure start by just adding a worker node before trying to do complicated stuff with taints and tolerations.
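That said, if you do want to put the taint back once a worker has joined, a rough sketch could look like the following. The helper names retaint_master and reschedule_pod and the KUBECTL override variable are invented for illustration. Note that a NoSchedule taint only affects new scheduling decisions, so pods already running on the master stay there until they are deleted and recreated by their controller.

```shell
# KUBECTL is an assumed override hook (for example KUBECTL=echo to preview
# the command); by default the real kubectl binary is used.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: re-add the master taint that
# "kubectl taint nodes --all node-role.kubernetes.io/master-" removed.
# $1 is the master node's name as shown by "kubectl get nodes".
retaint_master() {
  "$KUBECTL" taint nodes "$1" node-role.kubernetes.io/master=:NoSchedule
}

# Hypothetical helper: delete a pod so that its controller recreates it and
# the scheduler places the replacement on a node it tolerates.
# $1 is the pod name, $2 is the namespace.
reschedule_pod() {
  "$KUBECTL" delete pod "$1" --namespace="$2"
}
```

For example, after retaint_master with your master's node name, reschedule_pod tiller-deploy-df4fdf55d-jwtcz kube-system would remove the Tiller pod from the question; this assumes the pod is managed by a Deployment, which then creates a replacement for the scheduler to place.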