How does Kubernetes compute CPU utilization for HPA?

Question

I want to understand how HPA computes CPU utilization across Pods.

According to this doc, it takes each pod's CPU utilization (averaged over the last minute) divided by the CPU requested by that pod, and then computes the arithmetic mean of that value across all the pods.

Unfortunately, the doc contains some outdated information. For example, it says that --horizontal-pod-autoscaler-sync-period defaults to 30 seconds, but in the official doc the default value is 15 seconds.

When I tested, I noticed that HPA scales up even before the average CPU reaches the threshold I set (which is 90%), which made me think that maybe it takes the maximum CPU across Pods and not the average.

My question is: where can I find up-to-date documentation that explains exactly how HPA works?

Margaret Bloom · Accepted Answer

Note that I don't have a Kubernetes cluster at hand, so this is a theoretical answer based on the k8s source code.
See if it actually matches your experience.

Kubernetes is open source, and here seems to be the HPA code.

The functions GetResourceReplicas and calcPlainMetricReplicas (for non-utilization percentage) compute the number of replicas given the current metrics.
Both rely on a usageRatio: GetResourceReplicas gets it from GetResourceUtilizationRatio and calcPlainMetricReplicas from GetMetricUtilizationRatio. This value is multiplied by the number of currently ready pods to get the new number of pods:

New_number_of_pods = Old_number_of_ready_pods * usageRatio

There is a tolerance check (i.e. if the usageRatio falls close enough to 1, nothing is done). Pending and unknown-state pods are ignored (considered to use 0% of the resource), while pods without metrics are considered to use 100% of the resource.
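To make the replica step concrete, here is a minimal Python sketch of it. It is not the Kubernetes code: the function name desired_replicas is made up for illustration, the default tolerance of 0.1 and the rounding up are assumptions taken from the official HPA documentation rather than from the summary above, and the special handling of pending pods and pods without metrics is left out.

```python
import math


def desired_replicas(ready_pods: int, usage_ratio: float, tolerance: float = 0.1) -> int:
    """Hypothetical helper: scale the ready pod count by usage_ratio.

    Assumptions (not stated in the answer above): the tolerance defaults
    to 0.1 and the result is rounded up to a whole number of pods.
    """
    # Tolerance check: if the ratio is close enough to 1, keep the current count.
    if abs(usage_ratio - 1.0) <= tolerance:
        return ready_pods
    # Otherwise scale the ready pod count by the ratio, rounding up.
    return math.ceil(ready_pods * usage_ratio)


if __name__ == "__main__":
    print(desired_replicas(4, 1.5))   # ratio well above 1: scale up
    print(desired_replicas(4, 1.05))  # within tolerance: unchanged
    print(desired_replicas(4, 0.5))   # ratio well below 1: scale down
```

With a usageRatio of 1.5 and 4 ready pods this gives 6 pods, while a usageRatio of 1.05 leaves the count at 4 because it is inside the tolerance band.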

The usageRatio is computed by GetResourceUtilizationRatio, which is passed the metrics and the resource requests of all the pods. It works as follows:

utilization = Total_sum_resource_usage_all_pods / Total_sum_resource_requests_all_pods
usageRatio = utilization * 100 / targetUtilization

Where targetUtilization comes from the HPA spec.
The code is easier to read than this summary of mine. In this context the term request means "resource request" (that's an educated guess).
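The two formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as I summarized it, not the real Go implementation; the function name usage_ratio, the millicore units, and the sample numbers are all made up.

```python
from typing import Sequence


def usage_ratio(
    usages_millicores: Sequence[float],
    requests_millicores: Sequence[float],
    target_utilization: float,
) -> float:
    """Hypothetical helper mirroring the two formulas above.

    usages_millicores / requests_millicores hold one value per pod;
    target_utilization is a percentage (e.g. 90) as set in the HPA spec.
    """
    total_usage = sum(usages_millicores)
    total_request = sum(requests_millicores)
    if total_request == 0:
        raise ValueError("total resource request must be non-zero")
    # utilization = Total_sum_resource_usage_all_pods / Total_sum_resource_requests_all_pods
    utilization = total_usage / total_request
    # usageRatio = utilization * 100 / targetUtilization
    return utilization * 100 / target_utilization


if __name__ == "__main__":
    # Two pods using 450m and 900m against requests of 500m and 1000m,
    # with a 90% target: pooled utilization is 1350 / 1500.
    print(usage_ratio([450, 900], [500, 1000], 90))
```

In that sample the pooled utilization is 1350 / 1500 = 90%, which matches the target, so the ratio comes out at (approximately) 1 and no scaling would be triggered.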

So I'd say that the 90% is the resource usage across all pods, computed as if they were a single pod requesting the sum of each pod's request, with the metrics collected as if they were all running on a single dedicated node.
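A small Python sketch may help to see how this "single big pod" view differs from averaging per-pod percentages. Both function names and the numbers are invented for the example, and it only reflects my reading of the formulas above.

```python
from typing import Sequence


def pooled_utilization(usages: Sequence[float], requests: Sequence[float]) -> float:
    """Sum of usage over sum of requests, as in the formulas above."""
    return sum(usages) / sum(requests)


def mean_per_pod_utilization(usages: Sequence[float], requests: Sequence[float]) -> float:
    """Arithmetic mean of each pod's own usage/request ratio, for comparison."""
    ratios = [u / r for u, r in zip(usages, requests)]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical pods: one small pod fully busy, one large pod half busy.
    usages = [100, 500]     # millicores used
    requests = [100, 1000]  # millicores requested
    print(pooled_utilization(usages, requests))        # 600 / 1100
    print(mean_per_pod_utilization(usages, requests))  # (1.0 + 0.5) / 2
```

Here the pooled value is 600 / 1100 (about 55%) while the mean of the per-pod percentages is 75%. When every pod has the same request, as is usual for replicas created from one pod template, the two computations give the same number.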

Tags: cpu, kubernetes, autoscaling, hpa

Mouna
