I have a GCP managed instance group that I want to scale out and in between 0 and 1 instances using a cron schedule. GCP has a limitation that says:
Scaling schedules can only be used for MIGs that have at least one other type of autoscaling signal, such as a signal for scaling based on average CPU utilization, load balancing serving capacity, or Cloud Monitoring metrics.
So I must specify an additional autoscaling signal. The documentation goes on to suggest a workaround:
to scale based only on a schedule, you can set your CPU utilization target to 100%.
So I did. But then the managed group does not scale in to 0, it just stays at 1.
I've not used the Scale-in controls, so the only thing AFAICT that can prevent scale in is the 10 minute Stabilization period, which I have accounted for.
My autoscaler configuration:
{
"name":"myname",
"target":"the/url",
"autoscalingPolicy":{
"minNumReplicas":0,
"maxNumReplicas":1,
"scalingSchedules":{
"out":{
"minRequiredReplicas":1,
"schedule":"0,20,40 * * * *",
"durationSec":300,
"description":"scale out"
}
},
"cpuUtilization":{
"utilizationTarget":1
}
}
}
The schedule itself sets 5 minutes of scale-out to 1 instance, and then there are 10 minutes of stabilization, and then scale in to 0 should happen, but it does not.
If I use the same configuration, but only change maxNumReplicas=2 and minRequiredReplica=2, the autoscaler does scale in and out at the expected times, but between 1 and 2 instances. I think this means the schedule itself is fine.
My theory is that cpuUtilization signal prevents scaling in to 0. Is there a way I could scale between 0 and 1 on a schedule? perhaps another signal, not cpuUtilization?
Thanks!
Update:
The limitation that an additional autoscaling signal must be specified with scaling schedules is now gone and it is now possible to configure a schedule that would alternate between 0 and 1 instances (but see the general answer below).
When it is possible to scale to 0 instances:
In particular it is not possible to scale to 0 when one of autoscaling signals is CPU utilization, LB utilization or per-instance Cloud Monitoring metrics.
You are allowing auto scaling after CPU utilization reaches 100% (Autoscaling Policy). Because of that performance will be impacted. So you can set the policy between 60% to 90%.
Minimum number of instances (minNumReplicas) for instance groups with/without auto scaling should be 1, So Scale In at 0 is not possible.
For other signals/metrics also (HTTP Load balancing utilization, Stackdriver Monitoring Metric) Scale In at 0 is not possible.
Use Scale In controls. It helps if sudden load spikes occur.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With