
Multiple node pools vs single pool with many machines vs big machines

We're moving all of our infrastructure to Google Kubernetes Engine (GKE). We currently have 50+ AWS machines running lots of APIs, services, web apps, database servers, and more.

As we have already dockerized everything, it's time to start moving everything to GKE.

I have a question that may sound too basic, but I've been searching the Internet for a week and haven't found any reasonable post about it.

Straight to the point, which of the following approaches is better and why:

  1. Having multiple node pools with different machine types and always specifying which pool each deployment should go to (see the sketch after this list); or

  2. Having a single pool with lots of machines and letting the Kubernetes scheduler do the job, without worrying about where my deployments end up; or

  3. Having BIG machines (spread across multiple zones to improve the cluster's availability and resilience) and letting Kubernetes schedule everything onto them.
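
For reference, option 1 would look something like this in practice. This is a minimal sketch, assuming a node pool named api-pool (a made-up name), using the pool label GKE puts on every node:

```yaml
# Hypothetical Deployment pinned to a GKE node pool named "api-pool".
# GKE labels every node with cloud.google.com/gke-nodepool=<pool-name>.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: api-pool  # schedule only on this pool
      containers:
        - name: api
          image: gcr.io/my-project/example-api:1.0  # placeholder image
```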

asked Mar 20 '18 by stefanobaldo

People also ask

Why are there multiple node pools?

User node pools are where you place your application-specific pods: for example, use additional user node pools to provide GPUs for compute-intensive applications, or access to high-performance SSD storage. Multiple node pools give you finer control over how nodes are created and managed.

How many node pools do I need?

System node pools must run Linux. They can have as few as one node, but two are recommended (or three if it is your only Linux node pool). They are only supported on AKS clusters running on Virtual Machine Scale Sets.

How many node pools are there?

An AKS cluster can have a maximum of 100 node pools, and a maximum of 1,000 nodes across those node pools.

What is a node pool?

A node pool is a group of nodes within a cluster that all have the same configuration. Node pools use a NodeConfig specification. Each node in the pool has a Kubernetes node label, cloud.google.com/gke-nodepool, which has the node pool's name as its value.


2 Answers

A list of considerations, to be taken merely as hints; I don't claim to describe best practice.

  • Each node you add brings some overhead with it, but you gain flexibility and availability, making node failures and maintenance less impactful to production.

  • Nodes that are too small cause a big waste of resources, since sometimes it will be impossible to schedule a pod even though the total free RAM or CPU across all nodes would be enough. You can think of this issue as similar to memory fragmentation.

  • I guess your pods vary in size and in their memory and CPU requests, but I don't see this as a big issue in principle, nor as a reason to go for 1). I don't see why a big pod should run only on big machines while a small one is scheduled on small nodes. I would rather use 1) if you need different memory-to-CPU (GB per core) ratios to support different workloads.

I would advise running some tests in the initial phase to understand the size of your biggest pod and the average size of your workload, in order to properly choose machine types. Note that sizing a node to exactly fit one pod and dedicating it to that pod is not the right way to proceed (plain virtual machines exist for that kind of scenario), since fragmentation of resources can easily make a large pod impossible to schedule.
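
To illustrate the fragmentation point: the scheduler places pods by their resource requests, so a single request larger than any one node's free capacity is unschedulable even when the cluster as a whole has room. A minimal sketch (names and sizes are made up):

```yaml
# Hypothetical pod requesting 3 CPUs. On a cluster whose nodes each have
# only 2 CPUs free, this pod stays Pending even though the cluster has
# more than 3 CPUs free in total -- the "fragmentation" effect.
apiVersion: v1
kind: Pod
metadata:
  name: big-worker
spec:
  containers:
    - name: worker
      image: gcr.io/my-project/worker:1.0  # placeholder image
      resources:
        requests:
          cpu: "3"
          memory: 6Gi
```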

  • Consider that workload sizes will likely grow in the future, and that scaling vertically is not always immediate: you may need to switch off machines and terminate pods. I would oversize a bit with this in mind, especially since scaling horizontally is much easier.

  • As for the machine type, you could go for machines 5x the size of the biggest pod you have (or 3x? or 10x?). Also oversize the number of nodes in the cluster a bit, to account for overhead and fragmentation and to still have free resources.

    1. Remember that you have a hard limit of 110 pods per node (the GKE/Kubernetes default) and 5,000 nodes per cluster.

    2. Remember that in GCP the network egress throughput cap depends on the number of vCPUs a virtual machine instance has. Each vCPU has a 2 Gbps egress cap for peak performance, and each additional vCPU raises the cap, up to a theoretical maximum of 16 Gbps per virtual machine (see the worked numbers after this list).

    3. Regarding prices, note that there is no difference between buying two machines of size x and one of size 2x. Avoid customizing machine sizes, as it is rarely cost-effective; if your workload needs more CPU or memory, go for a highmem or highcpu machine type.
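
To make the egress numbers concrete: the cap works out to min(2 Gbps × vCPUs, 16 Gbps), so a 4-vCPU node tops out at 8 Gbps, and any node with 8 or more vCPUs hits the 16 Gbps ceiling.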

P.S. Since you are going to build a pretty big cluster, check that the cluster's DNS is scaled to match.
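
One concrete knob for that (a sketch, assuming your cluster runs the standard Kubernetes DNS horizontal autoscaler, which GKE deploys as kube-dns-autoscaler): its ConfigMap in kube-system controls how many DNS replicas you get as the cluster grows:

```yaml
# Sketch of the kube-dns-autoscaler ConfigMap; the parameter values
# below are illustrative, not recommendations.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns-autoscaler
  namespace: kube-system
data:
  # replicas = max(ceil(cores / coresPerReplica), ceil(nodes / nodesPerReplica))
  linear: |-
    {"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}
```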

I will add any further considerations that come to mind; consider updating your question in the future with a description of the path you chose and the issues you faced.

answered Sep 29 '22 by GalloCedrone


1) makes a lot of sense: if you want, you can still let deployments treat it as one large pool (by not adding a nodeSelector/nodeAffinity), but you can also have machines of different sizes, a pool of spot instances, and so on. And you can have pools that are tainted, and thus excluded from normal scheduling and available only to a particular set of workloads (see the sketch below). In my opinion it is preferable to build some proficiency with this approach from the very beginning, though with many provisioners it should be easy to migrate from 2) to 1) anyway.
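
As a sketch of the tainted-pool idea (all names are illustrative, and the pool is assumed to have been created with the taint dedicated=gpu:NoSchedule): only pods carrying a matching toleration may land on the pool, and a node affinity actually steers them there:

```yaml
# Hypothetical pod allowed onto, and steered to, a tainted node pool.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  tolerations:
    - key: dedicated
      operator: Equal
      value: gpu
      effect: NoSchedule  # permits scheduling onto the tainted pool
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values: [gpu-pool]  # actually steers the pod to the pool
  containers:
    - name: job
      image: gcr.io/my-project/gpu-job:1.0  # placeholder image
```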

2) As explained above, this is effectively a subset of 1), so it's better to build up experience with the 1) approach from day one. But if you ensure your provisioning solution supports an easy extension to the 1) model, you can get away with starting with this simplified approach.

3) Big is nice, but "big" is relative. It depends on the requirements and volume of your workloads. Remember that while you need to plan for the loss of a whole AZ anyway, it will be much more frequent to lose single nodes (reboots, decommissioning of underlying hardware, updates, etc.), so the more hosts you have, the smaller the impact of losing one. The bottom line is that you need to find your own balance that makes sense for your particular scale. Maybe 50 nodes is too many and 15 would cut it? Who knows but you :)

answered Sep 28 '22 by Radek 'Goblin' Pieczonka