
Multiple node pools vs single pool with many machines vs big machines

We're moving all of our infrastructure to Google Kubernetes Engine (GKE). We currently have 50+ AWS machines running lots of APIs, services, web apps, database servers, and more.

As we have already dockerized everything, it's time to start moving everything to GKE.

I have a question that may sound too basic, but I've been searching the Internet for a week and haven't found any reasonable post about it.

Straight to the point, which of the following approaches is better and why:

  1. Having multiple node pools with different machine types and always specifying which pool each deployment should go to (see the sketch after this list); or

  2. Having a single pool with lots of machines and letting the Kubernetes scheduler do the job, without worrying about where my deployments end up; or

  3. Having BIG machines (spread across multiple zones to improve the cluster's availability and resilience) and letting Kubernetes schedule everything onto them.
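
For reference, option 1 would look something like this in practice. This is a minimal sketch, assuming a node pool named api-pool (a made-up name), using the pool label GKE puts on every node:

```yaml
# Hypothetical Deployment pinned to a GKE node pool named "api-pool".
# GKE labels every node with cloud.google.com/gke-nodepool=<pool-name>.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: api-pool  # schedule only on this pool
      containers:
        - name: api
          image: gcr.io/my-project/example-api:1.0  # placeholder image
```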

asked Mar 20 '18 by stefanobaldo

People also ask

Why are there multiple node pools?

User node pools are where you place your application-specific pods: for example, use additional user node pools to provide GPUs for compute-intensive applications, or access to high-performance SSD storage. Multiple node pools give you finer control over how nodes are created and managed.

How many node pools do I need?

System node pools must run Linux. They can have as few as one node, but two are recommended (or three if it is your only Linux node pool). They are only supported on AKS clusters running on Virtual Machine Scale Sets.

How many node pools are there?

An AKS cluster can have a maximum of 100 node pools, and a maximum of 1,000 nodes across those node pools.

What is a node pool?

A node pool is a group of nodes within a cluster that all have the same configuration. Node pools use a NodeConfig specification. Each node in the pool has a Kubernetes node label, cloud.google.com/gke-nodepool, which has the node pool's name as its value.


2 Answers

A list of considerations, to be taken merely as hints; I don't claim to describe best practice.

  • Each node you add brings some overhead with it, but you gain flexibility and availability, making node failures and maintenance less impactful to production.

  • Nodes that are too small cause a big waste of resources, since sometimes it will be impossible to schedule a pod even though the total free RAM or CPU across all nodes would be enough. You can think of this issue as similar to memory fragmentation.

  • I guess your pods vary in size and in their memory and CPU requests, but I don't see this as a big issue in principle, nor as a reason to go for 1). I don't see why a big pod should run only on big machines while a small one is scheduled on small nodes. I would rather use 1) if you need different memory-to-CPU (GB per core) ratios to support different workloads.

I would advise running some tests in the initial phase to understand the size of your biggest pod and the average size of your workload, in order to properly choose machine types. Note that sizing a node to exactly fit one pod and dedicating it to that pod is not the right way to proceed (plain virtual machines exist for that kind of scenario), since fragmentation of resources can easily make a large pod impossible to schedule.
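
To illustrate the fragmentation point: the scheduler places pods by their resource requests, so a single request larger than any one node's free capacity is unschedulable even when the cluster as a whole has room. A minimal sketch (names and sizes are made up):

```yaml
# Hypothetical pod requesting 3 CPUs. On a cluster whose nodes each have
# only 2 CPUs free, this pod stays Pending even though the cluster has
# more than 3 CPUs free in total -- the "fragmentation" effect.
apiVersion: v1
kind: Pod
metadata:
  name: big-worker
spec:
  containers:
    - name: worker
      image: gcr.io/my-project/worker:1.0  # placeholder image
      resources:
        requests:
          cpu: "3"
          memory: 6Gi
```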

  • Consider that workload sizes will likely grow in the future, and that scaling vertically is not always immediate: you may need to switch off machines and terminate pods. I would oversize a bit with this in mind, especially since scaling horizontally is much easier.

  • As for the machine type, you could go for machines 5x the size of the biggest pod you have (or 3x? or 10x?). Also oversize the number of nodes in the cluster a bit, to account for overhead and fragmentation and to still have free resources.

    1. Remember that you have a hard limit of 110 pods per node (the GKE/Kubernetes default) and 5,000 nodes per cluster.

    2. Remember that in GCP the network egress throughput cap depends on the number of vCPUs a virtual machine instance has. Each vCPU has a 2 Gbps egress cap for peak performance, and each additional vCPU raises the cap, up to a theoretical maximum of 16 Gbps per virtual machine (see the worked numbers after this list).

    3. Regarding prices, note that there is no difference between buying two machines of size x and one of size 2x. Avoid customizing machine sizes, as it is rarely cost-effective; if your workload needs more CPU or memory, go for a highmem or highcpu machine type.
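
To make the egress numbers concrete: the cap works out to min(2 Gbps × vCPUs, 16 Gbps), so a 4-vCPU node tops out at 8 Gbps, and any node with 8 or more vCPUs hits the 16 Gbps ceiling.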

P.S. Since you are going to build a pretty big cluster, check that the cluster's DNS is scaled to match.
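
One concrete knob for that (a sketch, assuming your cluster runs the standard Kubernetes DNS horizontal autoscaler, which GKE deploys as kube-dns-autoscaler): its ConfigMap in kube-system controls how many DNS replicas you get as the cluster grows:

```yaml
# Sketch of the kube-dns-autoscaler ConfigMap; the parameter values
# below are illustrative, not recommendations.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns-autoscaler
  namespace: kube-system
data:
  # replicas = max(ceil(cores / coresPerReplica), ceil(nodes / nodesPerReplica))
  linear: |-
    {"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}
```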

I will add any further considerations that come to mind; consider updating your question in the future with a description of the path you chose and the issues you faced.

answered Sep 29 '22 by GalloCedrone


1) makes a lot of sense: if you want, you can still let deployments treat it as one large pool (by not adding a nodeSelector/nodeAffinity), but you can also have machines of different sizes, a pool of spot instances, and so on. And you can have pools that are tainted, and thus excluded from normal scheduling and available only to a particular set of workloads (see the sketch below). In my opinion it is preferable to build some proficiency with this approach from the very beginning, though with many provisioners it should be easy to migrate from 2) to 1) anyway.
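
As a sketch of the tainted-pool idea (all names are illustrative, and the pool is assumed to have been created with the taint dedicated=gpu:NoSchedule): only pods carrying a matching toleration may land on the pool, and a node affinity actually steers them there:

```yaml
# Hypothetical pod allowed onto, and steered to, a tainted node pool.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  tolerations:
    - key: dedicated
      operator: Equal
      value: gpu
      effect: NoSchedule  # permits scheduling onto the tainted pool
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values: [gpu-pool]  # actually steers the pod to the pool
  containers:
    - name: job
      image: gcr.io/my-project/gpu-job:1.0  # placeholder image
```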

2) As explained above, this is effectively a subset of 1), so it's better to build up experience with the 1) approach from day one. But if you ensure your provisioning solution supports an easy extension to the 1) model, you can get away with starting with this simplified approach.

3) Big is nice, but "big" is relative. It depends on the requirements and volume of your workloads. Remember that while you need to plan for the loss of a whole AZ anyway, it will be much more frequent to lose single nodes (reboots, decommissioning of underlying hardware, updates, etc.), so the more hosts you have, the smaller the impact of losing one. The bottom line is that you need to find your own balance that makes sense for your particular scale. Maybe 50 nodes is too many and 15 would cut it? Who knows but you :)

answered Sep 28 '22 by Radek 'Goblin' Pieczonka