
HPA Scaling even though Current CPU is below Target CPU

I am playing around with the Horizontal Pod Autoscaler in Kubernetes. I've set the HPA to start new instances once the average CPU utilization passes 35%. However, this does not seem to work as expected: the HPA triggers a rescale even though the CPU utilization is far below the defined target. The reported "current" utilization is 10%, which is far below 35%, yet it still rescaled the number of pods from 5 to 6.

I've also checked the metrics in my Google Cloud Platform dashboard (where we host the application). It also shows that the requested CPU utilization hasn't surpassed the 35% threshold. Still, several rescales occurred.

The content of my HPA

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
 name: django
spec:
{{ if eq .Values.env "prod" }}
 minReplicas: 5
 maxReplicas: 35
{{ else if eq .Values.env "staging" }}
 minReplicas: 1
 maxReplicas: 3
{{ end }}
 scaleTargetRef:
   apiVersion: apps/v1
   kind: Deployment
   name: django-app
 targetCPUUtilizationPercentage: 35

Does anyone know what the cause of this might be?

Jeroen Beljaars asked Oct 26 '25 03:10


1 Answer

Scaling is based on a percentage of the CPU requests, not the limits. I think the accepted answer should be changed, because its examples show:

limits:
  cpu: 1000m

But the targetCPUUtilizationPercentage is based on requests, like:

requests:
  cpu: 1000m

For per-pod resource metrics (like CPU), the controller fetches the metrics from the resource metrics API for each Pod targeted by the HorizontalPodAutoscaler. Then, if a target utilization value is set, the controller calculates the utilization value as a percentage of the equivalent resource request on the containers in each Pod. If a target raw value is set, the raw metric values are used directly. The controller then takes the mean of the utilization or the raw value (depending on the type of target specified) across all targeted Pods, and produces a ratio used to scale the number of desired replicas.

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#how-does-a-horizontalpodautoscaler-work
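
To make the difference between requests and limits concrete, here is a rough Python sketch of the calculation described in the quote above. It is a simplification, not the controller's actual code. The function names are made up for illustration, and the numbers (5 pods each using 100m, a 250m request, a 1000m limit) are assumptions, not values taken from the question. The replica formula, ceil(currentReplicas * currentMetricValue / desiredMetricValue), and the default 10% tolerance are taken from the Kubernetes documentation.

```python
import math


def utilization_percent(usage_millicores, request_millicores):
    """CPU usage of one pod as a percentage of its CPU request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas, usages, base_millicores,
                     target_percent, tolerance=0.1):
    """Simplified HPA calculation (hypothetical helper, not a real API).

    usages: per-pod CPU usage in millicores.
    base_millicores: the value utilization is measured against. The HPA
    uses the CPU *request* here; passing the limit instead only shows
    what one might wrongly expect.
    """
    mean_util = sum(utilization_percent(u, base_millicores)
                    for u in usages) / len(usages)
    ratio = mean_util / target_percent
    # No scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Assumed numbers: 5 pods, each using 100m CPU, target of 35%.
usages = [100] * 5

# Measured against an assumed 250m request: 40% utilization, scale up.
print(desired_replicas(5, usages, 250, 35))   # 6

# Measured against an assumed 1000m limit: 10% utilization, no scale up.
# (The real HPA would also clamp the result to minReplicas.)
print(desired_replicas(5, usages, 1000, 35))  # 2
```

With these assumed values, the same usage looks like 10% when compared to the limit but 40% when compared to the request, which is enough to go from 5 to 6 replicas at a 35% target.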

Drew answered Oct 28 '25 03:10


