Coming from numerous years of running node/rails apps on bare metal; i was used to be able to run as many apps as i wanted on a single machine (let's say, a 2Go at digital ocean could easily handle 10 apps without worrying, based on correct optimizations or fairly low amount of traffic)
Thing is, using kubernetes, the game sounds quite different. I've setup a "getting started" cluster with 2 standard vm (3.75Go).
Assigned a limit on a deployment with the following :
resources:
requests:
cpu: "64m"
memory: "128Mi"
limits:
cpu: "128m"
memory: "256Mi"
Then witnessing the following :
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default api 64m (6%) 128m (12%) 128Mi (3%) 256Mi (6%)
What does this 6% refers to ?
Tried to lower the CPU limit, to like, 20Mi… the app does to start (obviously, not enough resources). The docs says it is percentage of CPU. So, 20% of 3.75Go machine ? Then where this 6% comes from ?
Then increased the size of the node-pool to the n1-standard-2, the same pod effectively span 3% of node. That sounds logical, but what does it actually refers to ?
Still wonder what is the metrics to be taken in account for this part.
The app seems to need a large amount of memory on startup, but then it use only a minimal fraction of this 6%. I then feel like I misunderstanding something, or misusing it all
Thanks for any experienced tips/advices to have a better understanding Best
CPU is considered a “compressible” resource. If your app starts hitting your CPU limits, Kubernetes starts throttling your container. This means the CPU will be artificially restricted, giving your app potentially worse performance! However, it won't be terminated or evicted.
If a container attempts to exceed the specified limit, the system will throttle the container.
If you have a 2 node cluster and the first node has 2 cores and second node has 1 core, K8s CPU capacity will be 3 cores (2 core + 1 core). If you have a pod which requests 1.5 cores, then it will not be scheduled to the second node, as that node has a capacity of only 1 core.
According to the docs, CPU requests (and limits) are always fractions of available CPU cores on the node that the pod is scheduled on (with a resources. requests. cpu of "1" meaning reserving one CPU core exclusively for one pod). Fractions are allowed, so a CPU request of "0.5" will reserve half a CPU for one pod.
According to the docs, CPU requests (and limits) are always fractions of available CPU cores on the node that the pod is scheduled on (with a resources.requests.cpu
of "1"
meaning reserving one CPU core exclusively for one pod). Fractions are allowed, so a CPU request of "0.5"
will reserve half a CPU for one pod.
For convenience, Kubernetes allows you to specify CPU resource requests/limits in millicores:
The expression
0.1
is equivalent to the expression100m
, which can be read as “one hundred millicpu” (some may say “one hundred millicores”, and this is understood to mean the same thing when talking about Kubernetes). A request with a decimal point, like0.1
is converted to100m
by the API, and precision finer than1m
is not allowed.
As already mentioned in the other answer, resource requests are guaranteed. This means that Kubernetes will schedule pods in a way that the sum of all requests will not exceed the amount of resources actually available on a node.
So, by requesting 64m
of CPU time in your deployment, you are requesting actually 64/1000 = 0,064 = 6,4% of one of the node's CPU cores time. So that's where your 6% come from. When upgrading to a VM with more CPU cores, the amount of available CPU resources increases, so on a machine with two available CPU cores, a request for 6,4% of one CPU's time will allocate 3,2% of the CPU time of two CPUs.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With