
How do I configure a managed instance group and autoscaling in Google Cloud Platform?

Autoscaling automatically adds or removes Compute Engine instances based on load. In GCP, autoscaling requires an instance template and a managed instance group.

This question is a part of another question's answer, which is about building an autoscaled and load-balanced backend.

I have written the answer below, which walks through the steps to set up autoscaling in GCP.

Lakshman Diwaakar asked Jan 09 '17 05:01





1 Answer

Autoscaling is a feature of managed instance groups in GCP. It handles heavy traffic by adding instances and removes instances when traffic drops, which saves money.

To set up autoscaling, we need the following:

  • Instance template
  • Managed instance group
  • Autoscaling policy
  • Health check

An instance template is a blueprint that defines the machine type, image, and disks of the homogeneous instances that run in the autoscaled managed instance group. I have written the steps for setting up an instance template here.
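As a rough sketch of what creating such a template could look like: the template name sample-template is the one used in the rest of this answer, while the machine type, image family, image project, and disk size below are assumptions chosen for illustration. The run helper is also hypothetical; it prints the command by default so nothing billable is created by accident.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the template the managed instance group will be based on.
# Machine type, image, and disk size are illustrative assumptions.
create_template() {
  run gcloud compute instance-templates create sample-template \
    --machine-type e2-medium \
    --image-family debian-12 \
    --image-project debian-cloud \
    --boot-disk-size 10GB
}

create_template
```

Running the script with DRY_RUN=0 executes the command against the currently configured gcloud project.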

A managed instance group maintains a group of homogeneous instances based on a single instance template. Assuming the instance template is named sample-template, the group can be created by running the following command in gcloud:

gcloud compute instance-groups managed create autoscale-managed-instance-group \
  --base-instance-name autoscaled-instance \
  --size 3 \
  --template sample-template \
  --region asia-northeast1

The above command creates a regional managed instance group of 3 Compute Engine instances, spread across three different zones in the asia-northeast1 region, based on sample-template.

  • base-instance-name is the base name for all the automatically created instances. Every instance name consists of the base name followed by a uniquely generated random string.
  • size is the desired number of instances in the group. For now, 3 instances will be running all the time, irrespective of the traffic generated by the application. Later, the group can be autoscaled by applying a policy to it.
  • region (multi-zone) or zone (single-zone): a managed instance group can be set up in a region, in which case the homogeneous instances are evenly distributed across the zones of that region, or in a single zone, in which case all instances are deployed in the same zone. A cross-region deployment was also available in alpha at the time of writing.
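For comparison with the regional group above, a single-zone group could be sketched as follows. The group name zonal-managed-instance-group, the base name zonal-instance, and the zone asia-northeast1-a are assumptions for illustration; the second command lists the instances so the zone placement can be checked. The hypothetical run helper prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Zonal variant: --zone replaces --region, so all instances land in one zone.
create_zonal_group() {
  run gcloud compute instance-groups managed create zonal-managed-instance-group \
    --base-instance-name zonal-instance \
    --size 3 \
    --template sample-template \
    --zone asia-northeast1-a
}

# List the instances in the group to confirm where they were placed.
list_zonal_group() {
  run gcloud compute instance-groups managed list-instances zonal-managed-instance-group \
    --zone asia-northeast1-a
}

create_zonal_group
list_zonal_group
```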

The autoscaling policy determines the autoscaler's behaviour. The autoscaler aggregates data from the instances, compares it with the desired capacity specified in the policy, and determines the action to take. Several autoscaling policies are available, such as:

  • Average CPU utilization
  • HTTP load balancing serving capacity (requests / second)
  • Stackdriver (now Cloud Monitoring) standard and custom metrics
  • and many more

Now, add autoscaling to this managed instance group by running the following command in gcloud:

gcloud compute instance-groups managed set-autoscaling autoscale-managed-instance-group \
  --max-num-replicas 6 \
  --min-num-replicas 2 \
  --target-cpu-utilization 0.60 \
  --cool-down-period 120 \
  --region asia-northeast1

The above command sets up an autoscaler that targets 60% average CPU utilization and keeps the group between 2 instances (when there is no traffic) and 6 instances (under heavy traffic).

  • The cool-down-period flag specifies the number of seconds to wait after an instance has started before the associated autoscaler starts to collect information from it.
  • An autoscaler can be associated with a maximum of 5 different policies. When there is more than one policy, the autoscaler follows the policy that yields the largest number of instances.
  • Interesting fact: when an instance is spun up by the autoscaler, the autoscaler makes sure that the instance runs for at least 10 minutes irrespective of the traffic. This was done because, at the time of writing, GCP billed a minimum of ten minutes of running time for a Compute Engine instance (Compute Engine has since moved to per-second billing with a one-minute minimum). It also protects against erratic spinning up and shutting down of instances.
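To make the policy arithmetic concrete, here is a simplified model of how a target-utilization policy translates into a group size. It is a sketch under stated assumptions, not the autoscaler's actual algorithm: it assumes the recommended size is the current size scaled by observed/target utilization, rounded up, then clamped to the min and max from the command above (2 and 6). The function names are hypothetical, and utilization is expressed in whole percent to stay within integer shell arithmetic.

```shell
#!/bin/sh
# recommend_size CURRENT_SIZE OBSERVED_PCT TARGET_PCT
# Assumed model: ceil(current_size * observed / target).
recommend_size() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# clamp VALUE MIN MAX: keep the recommendation within the policy limits.
clamp() {
  if [ "$1" -lt "$2" ]; then
    echo "$2"
  elif [ "$1" -gt "$3" ]; then
    echo "$3"
  else
    echo "$1"
  fi
}

# max_of N1 N2 ...: with several policies, the largest recommendation wins.
max_of() {
  largest=$1
  shift
  for value in "$@"; do
    if [ "$value" -gt "$largest" ]; then
      largest=$value
    fi
  done
  echo "$largest"
}

# 3 instances at 90% CPU against a 60% target: ceil(4.5) = 5 instances.
clamp "$(recommend_size 3 90 60)" 2 6    # prints 5
# 3 idle instances at 5% CPU: ceil(0.25) = 1, raised to the minimum of 2.
clamp "$(recommend_size 3 5 60)" 2 6     # prints 2
# 6 instances at 95% CPU: ceil(9.5) = 10, capped at the maximum of 6.
clamp "$(recommend_size 6 95 60)" 2 6    # prints 6
# A CPU policy recommending 5 and a second policy recommending 4 yield 5.
max_of 5 4                               # prints 5
```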

Best practices: from my perspective, it is better to create a custom image with all your software installed than to use a startup script, because new instances in the autoscaling group should launch as quickly as possible. This increases the speed at which your web app scales.
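A possible way to follow this practice is sketched below. All resource names (webapp-disk, webapp-image, the webapp image family, sample-template-v2) are hypothetical, and the sketch assumes the software was installed on a VM whose boot disk is webapp-disk and that the VM was stopped before imaging. The hypothetical run helper prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

bake_image_and_template() {
  # 1. Capture the prepared boot disk as a custom image in an image family.
  run gcloud compute images create webapp-image \
    --source-disk webapp-disk \
    --source-disk-zone asia-northeast1-a \
    --family webapp

  # 2. Create a new template that boots from the latest image in that family.
  run gcloud compute instance-templates create sample-template-v2 \
    --machine-type e2-medium \
    --image-family webapp

  # 3. Point the managed instance group at the new template.
  run gcloud compute instance-groups managed set-instance-template \
    autoscale-managed-instance-group \
    --template sample-template-v2 \
    --region asia-northeast1
}

bake_image_and_template
```

Instances the autoscaler creates after the last step boot from the custom image, so they do not need to install software at startup.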

This is part 2 of a 3-part series about building an autoscaled and load-balanced backend.

Lakshman Diwaakar answered Sep 23 '22 08:09
