We have auto scaling groups for one of our cloud formation stacks that has a CPU based alarm for determining when to scale the instances.
This is great but we recently had it scale up from one node to three and one of those nodes failed to bootstrap via cfn-init. Once the workload reduced and the group scaled back down to one node it killed the two good instances and left the partially bootstrapped node as the only remaining instance. This meant that we stopped processing work until someone logged in and re-ran the bootstrap process.
Obviously this is not ideal. What is the best way to notify the auto scaling group that a node is not healthy when it does not sit behind an ELB?
Since this is just initial bootstrap what I'd really like is to communicate back to the auto scaling group that this node failed and have it terminated and a new node spun up in its place.
You don't have to use ELB to use Auto Scaling. You can use the EC2 health check to identify and replace unhealthy instances.
The ELB allows you to dynamically manage loads across your resources based upon target groups and rules whereas EC2 auto scaling allows you to elastically scale those target groups based upon the demand put upon your infrastructure.
When Amazon EC2 Auto Scaling determines that an InService instance is unhealthy, it terminates the instance while it launches a new replacement instance. The new instance launches using the current settings of the Auto Scaling group and its associated launch template or launch configuration.
If you configure the Auto Scaling group to use Elastic Load Balancing health checks, it considers the instance unhealthy if it fails either the EC2 status checks or the Elastic Load Balancing health checks.
A colleague just showed me http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/as-configure-healthcheck.html which looks handy.
If you have your own health check system, you can use the information from your health check system to set the health state of the instances in the Auto Scaling group.
UPDATE - I managed to get this working during launch.
Here's what my UserData section for the ASG looks like:
#!/bin/bash -v
set -x
export AWS_DEFAULT_REGION=us-west-1
cfn-init --region us-west-1 --stack bapi-prod --resource LaunchConfiguration -v
if [[ $? -ne 0 ]]; then
export INSTANCE=`curl http://169.254.169.254/latest/meta-data/instance-id`
aws autoscaling set-instance-health \
--instance-id $INSTANCE \
--health-status Unhealthy
fi
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With