
How does Kubernetes' scheduler work?

Tags:

kubernetes

How does Kubernetes' scheduler work? From what I can see, it appears to be very simple.

My initial thought is that this scheduler is just a simple admission control system, not a real scheduler. Is that correct?

I found a short description, but it is not terribly informative:

The kubernetes scheduler is a policy-rich, topology-aware, workload-specific function that significantly impacts availability, performance, and capacity. The scheduler needs to take into account individual and collective resource requirements, quality of service requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, deadlines, and so on. Workload-specific requirements will be exposed through the API as necessary.

Asked Mar 04 '15 by Halacs




1 Answer

The paragraph you quoted describes where we hope to be in the future (where the future is defined in units of months, not years). We're not there yet, but the scheduler does have a number of useful features already, enough for a simple deployment. In the rest of this reply, I'll explain how the scheduler works today.

The scheduler is not just an admission controller; for each pod that is created, it finds the "best" machine for that pod, and if no machine is suitable, the pod remains unscheduled until a machine becomes suitable.

The scheduler is configurable. It has two types of policies, FitPredicate (see master/pkg/scheduler/predicates.go) and PriorityFunction (see master/pkg/scheduler/priorities.go). I'll describe them.

Fit predicates are required rules. For example: the labels on the node must be compatible with the label selector on the pod (implemented in PodSelectorMatches() in predicates.go), and the sum of the resources requested by the container(s) already running on the machine plus the resources requested by the new container(s) you are considering scheduling onto the machine must not exceed the machine's capacity (implemented in PodFitsResources() in predicates.go; note that "requested resources" is defined as pod.Spec.Containers[n].Resources.Limits, and a pod that requests zero resources always fits). If any required rule is not satisfied for a particular (new pod, machine) pair, the new pod is not scheduled on that machine. If, after checking all machines, the scheduler decides that the new pod cannot be scheduled onto any machine, the pod remains in the Pending state until one of the machines can satisfy it.
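To make the resource-fit rule concrete, here is a minimal sketch of a PodFitsResources-style predicate. The types and field names are simplified stand-ins, not the real Kubernetes API:

```go
package main

import "fmt"

// Simplified stand-ins for the real api.Pod and api.Node types.
type Pod struct {
	Name         string
	RequestedCPU int // millicores; zero means "always fits", as noted above
	RequestedMem int // MiB
}

type Node struct {
	Name        string
	CapacityCPU int
	CapacityMem int
	Pods        []Pod // pods already scheduled on this node
}

// podFitsResources mimics the PodFitsResources() predicate: the sum of
// resources requested by pods already on the node, plus the new pod's
// requests, must not exceed the node's capacity.
func podFitsResources(pod Pod, node Node) bool {
	usedCPU, usedMem := 0, 0
	for _, p := range node.Pods {
		usedCPU += p.RequestedCPU
		usedMem += p.RequestedMem
	}
	return usedCPU+pod.RequestedCPU <= node.CapacityCPU &&
		usedMem+pod.RequestedMem <= node.CapacityMem
}

func main() {
	node := Node{Name: "node-1", CapacityCPU: 1000, CapacityMem: 2048,
		Pods: []Pod{{Name: "running", RequestedCPU: 600, RequestedMem: 1024}}}
	fmt.Println(podFitsResources(Pod{Name: "small", RequestedCPU: 300, RequestedMem: 512}, node)) // true
	fmt.Println(podFitsResources(Pod{Name: "big", RequestedCPU: 500, RequestedMem: 512}, node))   // false: 600+500 > 1000
}
```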

After checking all of the machines with respect to the fit predicates, the scheduler may find that multiple machines "fit" the pod. But of course, the pod can only be scheduled onto one machine. That's where priority functions come in. Basically, the scheduler ranks the machines that meet all of the fit predicates, and then chooses the best one. For example, it prefers the machine whose already-running pods consume the least resources (this is implemented in LeastRequestedPriority() in priorities.go). This policy spreads pods (and thus containers) out instead of packing lots onto one machine while leaving others empty.
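The spreading behaviour of LeastRequestedPriority() can be sketched like this. The scoring formula (0-10 per resource, averaged over CPU and memory) matches the general idea; the type and function names here are illustrative, not the real API:

```go
package main

import "fmt"

// Simplified node view: total capacity and the sum of resources already
// requested by pods scheduled there.
type NodeInfo struct {
	Name        string
	CapacityCPU int
	UsedCPU     int
	CapacityMem int
	UsedMem     int
}

// leastRequestedScore follows the idea of LeastRequestedPriority():
// nodes with more free capacity score higher (0-10, averaged over CPU
// and memory), so pods spread out instead of packing one machine.
func leastRequestedScore(n NodeInfo) int {
	cpu := 10 * (n.CapacityCPU - n.UsedCPU) / n.CapacityCPU
	mem := 10 * (n.CapacityMem - n.UsedMem) / n.CapacityMem
	return (cpu + mem) / 2
}

// pickNode ranks the nodes that passed all fit predicates and returns
// the highest-scoring one.
func pickNode(nodes []NodeInfo) NodeInfo {
	best := nodes[0]
	for _, n := range nodes[1:] {
		if leastRequestedScore(n) > leastRequestedScore(best) {
			best = n
		}
	}
	return best
}

func main() {
	nodes := []NodeInfo{
		{Name: "busy", CapacityCPU: 1000, UsedCPU: 800, CapacityMem: 2048, UsedMem: 1536},
		{Name: "idle", CapacityCPU: 1000, UsedCPU: 100, CapacityMem: 2048, UsedMem: 256},
	}
	fmt.Println(pickNode(nodes).Name) // idle
}
```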

When I said that the scheduler is configurable, I mean that you can decide at compile time which fit predicates and priority functions you want Kubernetes to apply. Currently, it applies all of the ones you see in predicates.go and priorities.go.
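Putting the two phases together, the overall loop looks roughly like the following sketch: run every compiled-in fit predicate against every machine, then rank the survivors with the priority functions. The function types and names are hypothetical, for illustration only:

```go
package main

import "fmt"

type Pod struct {
	Name string
	CPU  int
}

type Node struct {
	Name    string
	FreeCPU int
}

// The scheduler is wired up with whichever predicates and priority
// functions were chosen at compile time.
type FitPredicate func(Pod, Node) bool
type PriorityFunc func(Pod, Node) int

// schedule mirrors the two-phase process described above: filter with
// the required rules, then rank what remains. Returns "" when no node
// fits, i.e. the pod would stay Pending.
func schedule(pod Pod, nodes []Node, preds []FitPredicate, prios []PriorityFunc) string {
	bestName, bestScore := "", -1
Nodes:
	for _, n := range nodes {
		for _, fits := range preds {
			if !fits(pod, n) {
				continue Nodes // a required rule filtered this node out
			}
		}
		score := 0
		for _, prio := range prios {
			score += prio(pod, n)
		}
		if score > bestScore {
			bestName, bestScore = n.Name, score
		}
	}
	return bestName
}

func main() {
	preds := []FitPredicate{func(p Pod, n Node) bool { return p.CPU <= n.FreeCPU }}
	prios := []PriorityFunc{func(p Pod, n Node) int { return n.FreeCPU }}
	nodes := []Node{{"node-a", 200}, {"node-b", 900}}
	fmt.Println(schedule(Pod{"web", 300}, nodes, preds, prios)) // node-b
}
```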

Answered Sep 29 '22 by DavidO