
Kubeadm and the Risks of Scheduling Pods on Master Node (Pods always Pending)

While following the Kubernetes article "Using kubeadm to Create a Cluster", I got stuck: the AddOn pods I was trying to install (Nginx, Tiller, Grafana, InfluxDB, Dashboard) always stayed in the Pending state.

Running kubectl describe pod tiller-deploy-df4fdf55d-jwtcz --namespace=kube-system showed the following event:

Type     Reason            Age                From               Message
----     ------            ----               ----               -------
Warning  FailedScheduling  51s (x15 over 3m)  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.

After I ran the command from the Master Isolation section, kubectl taint nodes --all node-role.kubernetes.io/master-, the AddOns installed as expected.

Because the pods then ended up running on the master node, I can only suspect that the reason was that I hadn't connected a worker node to the cluster yet for the scheduler to schedule the pods on.

The documentation states "your cluster will not schedule pods on the master for security reasons". This is a non-production environment, so there is little risk in this situation, but what is the risk of removing that taint in a production cluster?

Follow-up: If this is a risk, how can I re-add that taint so I can then uninstall the AddOn pods and try to have the scheduler install them on my Worker Node?

Environment Details:
Operating System - CentOS 7.4.1708 (Core)
Kubernetes Version - 1.10

Flea asked Apr 06 '18 13:04



1 Answer

the reason was that I hadn't connected a worker node to the cluster yet for the scheduler to schedule the pods on.

100% correct. You will for sure want some worker nodes; otherwise the idea of "scheduling work" becomes very weird.
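
One way to confirm this is to look at which nodes exist and which taints they carry. The following is only a sketch: the helper name show_taints is hypothetical, and it assumes kubectl is already configured against the cluster. It filters the kubectl describe nodes output down to the Name and Taints lines.

```shell
# Hypothetical helper: print each node's name and the first line of its taints.
# Assumes kubectl is configured for the cluster in question.
show_taints() {
  kubectl describe nodes | grep -E '^(Name|Taints):'
}
```

On a freshly initialized single-node kubeadm cluster, calling show_taints would be expected to list only the master, with a node-role.kubernetes.io/master:NoSchedule taint, which matches the "0/1 nodes are available" event in the question. If a node has several taints, the additional ones appear on indented continuation lines that this filter does not show.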

but what is the risk of removing that taint in a production cluster?

I am not a Kubernetes security expert, but a pragmatic risk is CPU, I/O, and/or memory exhaustion on the master nodes, which would have very severe consequences for the health of the cluster. There is almost never a reason to run a workload on a master node, and doing so almost always just increases risk, so the advice "just don't do it" is well founded.

how can I re-add that taint so I can then uninstall the AddOn pods and try to have the scheduler install them on my Worker Node?

I'm not sure I follow that question, but I would for sure start by just adding a worker node before trying to do complicated stuff with taints and tolerations.
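
If you do want to put the taint back later, here is a minimal sketch. It assumes the taint removed in the question is the standard kubeadm one (key node-role.kubernetes.io/master, empty value, effect NoSchedule). The helper names retaint_master and reschedule_pod and the node name master-1 are hypothetical placeholders, not anything from the kubeadm guide.

```shell
# Hypothetical helper: re-apply the master taint to the node named in $1.
# The empty value between "=" and ":" mirrors the taint that
# "kubectl taint nodes --all node-role.kubernetes.io/master-" removed.
retaint_master() {
  kubectl taint nodes "$1" node-role.kubernetes.io/master=:NoSchedule
}

# NoSchedule only affects new scheduling decisions; pods already running
# on the master stay there. Deleting a pod ($1) in namespace ($2) lets its
# controller (for example a Deployment) recreate it on a schedulable node.
reschedule_pod() {
  kubectl delete pod "$1" --namespace="$2"
}
```

For example, retaint_master master-1 followed by reschedule_pod tiller-deploy-df4fdf55d-jwtcz kube-system. If no worker node has joined the cluster by then, the recreated pods would be expected to go back to Pending, so add the worker first.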

mdaniel answered Oct 05 '22 04:10