Google Compute Engine auto scaling based on queue length

Question

We host our infrastructure on Google Compute Engine and are looking into Autoscaling for groups of instances. We do a lot of batch processing of binary data from a queue. In our case, this means:

While a worker is processing data, its CPU utilization is always at 100%
When the queue is empty, we want to terminate all workers
Depending on the length of the queue, we want a certain number of workers

However, I'm finding it hard to figure out a way to autoscale this on Google Compute Engine, because the autoscaler appears to scale only on per-instance metrics such as CPU. From the documentation:

Not all custom metrics can be used by the autoscaler. To choose a valid custom metric, the metric must have all of the following properties:

The metric must be a per-instance metric.

The metric must be a valid utilization metric, which means that data from the metric can be used to proportionally scale up or down the number of virtual machines.

If I'm reading the documentation correctly, this makes it hard to autoscale on a global value such as queue length. Is that right?

Backup solutions

Write a simple autoscaling handler that uses the Google Cloud API to create or destroy workers through the Instances API
Write a simple autoscaling handler that uses instance groups and manually inserts/removes instances using InstanceGroups: insert
Write a simple autoscaling handler using InstanceGroupManagers: resize
Create a custom per-instance metric that reports len(queue)/len(workers) on all workers
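
The sizing logic behind these backup options can be sketched in a few lines of Python. This is only an illustration: the function names and the parameters items_per_worker and max_workers are hypothetical and not part of any Google API, and the call that would actually resize the group (for example InstanceGroupManagers: resize) is deliberately left out.

```python
import math


def desired_workers(queue_length: int, items_per_worker: int, max_workers: int) -> int:
    """Return how many workers a handler might request for a given queue length.

    items_per_worker and max_workers are assumed tuning knobs, not GCE settings.
    """
    if queue_length <= 0:
        # Empty queue: terminate all workers.
        return 0
    # Round up so a partially filled "slot" still gets a worker, then cap the result.
    return min(max_workers, math.ceil(queue_length / items_per_worker))


def per_worker_backlog(queue_length: int, worker_count: int) -> float:
    """Value each worker could export as a custom per-instance metric
    (the len(queue)/len(workers) idea). Guards against division by zero."""
    return queue_length / max(worker_count, 1)
```

A handler built this way would poll the queue periodically, call desired_workers, and pass the result to whichever resize mechanism it uses.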

Niklas B · Accepted Answer

As of February 2018, this is possible (in Beta) via "per-group metrics" in Stackdriver.

Per-group metrics allow autoscaling with a standard or custom metric that does not export per-instance utilization data. Instead, the group scales based on a value that applies to the whole group and corresponds to how much work is available for the group or how busy the group is. The group scales based on the fluctuation of that group metric value and the configuration that you define.

More information at https://cloud.google.com/compute/docs/autoscaler/scaling-stackdriver-monitoring-metrics#per_group_metrics

The how-to is too long to post here.
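
As a rough illustration of the idea only, and not the autoscaler's actual implementation: with a per-group metric such as queue length, the group size can be thought of as the total amount of work divided by the amount one instance is expected to handle, kept within the group's size limits. The function and parameter names below are hypothetical, and the real autoscaler also applies its own stabilization and cool-down behaviour that this sketch ignores.

```python
import math


def per_group_target_size(total_work: float, work_per_instance: float,
                          min_size: int, max_size: int) -> int:
    """Approximate the group size for a per-group metric.

    total_work: the group-wide metric value, e.g. the current queue length.
    work_per_instance: assumed amount of work one instance should handle.
    min_size / max_size: assumed lower and upper bounds on the group size.
    """
    if work_per_instance <= 0:
        raise ValueError("work_per_instance must be positive")
    wanted = math.ceil(total_work / work_per_instance)
    # Clamp to the configured bounds.
    return max(min_size, min(max_size, wanted))
```

With min_size set to 0, an empty queue yields a target of 0 instances, which matches the "terminate all workers" requirement in the question.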

Grzenio · Answer

As far as I understand, this is not implemented yet (as of January 2016). At the moment, autoscaling is targeted only at web-serving scenarios, where you serve web pages or other web services from your machines and want to keep some reasonable headroom (e.g. in terms of CPU or other metrics) for spikes in traffic. The system then adjusts the number of instances/VMs to match your target.

You are looking for autoscaling for batch-processing scenarios, which is not catered for at the moment.

Tags:

google-compute-engine

google-cloud-platform