I have the below Horizontal Pod Autoscaller configuration on Google Kubernetes Engine to scale a deployment by a custom metric - RabbitMQ messages ready count
for a specific queue: foo-queue
.
It picks up the metric value correctly.
When inserting 2 messages it scales the deployment to the maximum 10 replicas. I expect it to scale to 2 replicas since the targetValue is 1 and there are 2 messages ready.
Why does it scale so aggressively?
HPA configuration:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: foo-hpa
namespace: development
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: foo
minReplicas: 1
maxReplicas: 10
metrics:
- type: External
external:
metricName: "custom.googleapis.com|rabbitmq_queue_messages_ready"
metricSelector:
matchLabels:
metric.labels.queue: foo-queue
targetValue: 1
According to https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
From the most basic perspective, the Horizontal Pod Autoscaler controller operates on the ratio between desired metric value and current metric value:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
From the above I understand that as long as the queue has messages the k8 HPA will continue to scale up since currentReplicas
is part of the desiredReplicas
calculation.
For example if:
currentReplicas
= 1
currentMetricValue
/ desiredMetricValue
= 2/1
then:
desiredReplicas
= 2
If the metric stay the same in the next hpa cycle currentReplicas
will become 2 and desiredReplicas
will be raised to 4
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With