I'm working with a CloudFormation template that brings up as many instances as I request, and want to wait for them to finish initialising (via User Data) before the stack creation/update is considered complete.
Creating or updating the stack should wait for signals from all newly created instances, so as to ensure that their initialisation is complete.
I don't want the stack creation or update to be considered successful if any of the created instances fail to initialise.
CloudFormation only seems to wait for signals from instances when the stack is first created. Updating the stack and increasing the number of instances seems to disregard signalling. The update operation finishes successfully very quickly, whilst instances are still being initialised.
Instances created as a result of updating the stack can fail to initialise, but the update action would've already been considered a success.
Using CloudFormation, how can I make the reality meet the expectation?
I want the behaviour that applies when the stack is created to also apply when the stack is updated.
I have found only the following question that matches my problem: UpdatePolicy in Autoscaling group not working correctly for AWS CloudFormation update
It's been open for a year and has not received an answer.
I'm creating another question as I've more information to add, and I'm not sure if these particulars will match those of the author in that question.
To demonstrate the problem, I've created a template based off of the example beneath the Auto Scaling Group header on this AWS documentation page, which includes signalling.
The created template has been adapted as so:
... ap-northeast-1). The cfn-signal command has been bootstrapped and called as necessary considering this change.
Here's the template, saved to template.yml:
Parameters:
  DesiredCapacity:
    Type: Number
    Description: How many instances would you like in the Auto Scaling Group?
Resources:
  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      AvailabilityZones: !GetAZs ''
      LaunchConfigurationName: !Ref LaunchConfig
      MinSize: !Ref DesiredCapacity
      MaxSize: !Ref DesiredCapacity
    CreationPolicy:
      ResourceSignal:
        Count: !Ref DesiredCapacity
        Timeout: PT5M
    UpdatePolicy:
      AutoScalingScheduledAction:
        IgnoreUnmodifiedGroupSizeProperties: true
      AutoScalingRollingUpdate:
        MinInstancesInService: 1
        MaxBatchSize: 2
        PauseTime: PT5M
        WaitOnResourceSignals: true
  LaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: ami-b7d829d6
      InstanceType: t2.micro
      UserData:
        'Fn::Base64':
          !Sub |
            #!/bin/bash -xe
            sleep 120
            apt-get -y install python-setuptools
            TMP=`mktemp -d`
            curl https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz | \
              tar xz -C $TMP --strip-components 1
            easy_install $TMP
            /usr/local/bin/cfn-signal -e $? \
              --stack ${AWS::StackName} \
              --resource AutoScalingGroup \
              --region ${AWS::Region}
Now I create the stack with a single instance, via:
$ aws cloudformation create-stack \
    --region=ap-northeast-1 \
    --stack-name=asg-test \
    --template-body=file://template.yml \
    --parameters ParameterKey=DesiredCapacity,ParameterValue=1
After waiting a few minutes for the creation to complete, let's look at some key stack events:
$ aws cloudformation describe-stack-events \
    --region=ap-northeast-1 \
    --stack-name=asg-test
...
{
    "Timestamp": "2017-02-03T05:36:45.445Z",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    ...
    "ResourceStatus": "CREATE_COMPLETE",
    ...
},
{
    "Timestamp": "2017-02-03T05:36:42.487Z",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    ...
    "ResourceStatusReason": "Received SUCCESS signal with UniqueId ...",
    "ResourceStatus": "CREATE_IN_PROGRESS"
},
{
    "Timestamp": "2017-02-03T05:33:33.274Z",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    ...
    "ResourceStatusReason": "Resource creation Initiated",
    "ResourceStatus": "CREATE_IN_PROGRESS",
    ...
}
...
You can see that the auto scaling group started initiating at 05:33:33. At 05:36:42 (3 minutes after initiation), it received a success signal. This allowed the auto scaling group to reach its own success status only moments after, at 05:36:45.
That's awesome - working like a charm.
Now let's try increasing the number of instances in this auto scaling group to 2 by updating the stack:
$ aws cloudformation update-stack \
    --region=ap-northeast-1 \
    --stack-name=asg-test \
    --template-body=file://template.yml \
    --parameters ParameterKey=DesiredCapacity,ParameterValue=2
After waiting a much shorter time for the update to complete, let's look at some of the new stack events:
$ aws cloudformation describe-stack-events \
    --region=ap-northeast-1 \
    --stack-name=asg-test
{
    "ResourceStatus": "UPDATE_COMPLETE",
    ...
    "ResourceType": "AWS::CloudFormation::Stack",
    ...
    "Timestamp": "2017-02-03T05:45:47.063Z"
},
...
{
    "ResourceStatus": "UPDATE_COMPLETE",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    "Timestamp": "2017-02-03T05:45:43.047Z"
},
{
    "ResourceStatus": "UPDATE_IN_PROGRESS",
    ...,
    "LogicalResourceId": "AutoScalingGroup",
    "Timestamp": "2017-02-03T05:44:20.845Z"
},
{
    "ResourceStatus": "UPDATE_IN_PROGRESS",
    ...
    "ResourceType": "AWS::CloudFormation::Stack",
    ...
    "Timestamp": "2017-02-03T05:44:15.671Z",
    "ResourceStatusReason": "User Initiated"
},
...
Now you can see that whilst the auto scaling group started updating at 05:44:20, it completed at 05:45:43 - that's less than one and a half minutes to completion, which shouldn't be possible considering a sleep time of 120 seconds in the user data.
The stack update then proceeds to completion without the auto scaling group ever having received any signals.
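As a quick check of the arithmetic, the elapsed times can be computed directly from the event timestamps quoted above. This is only a minimal sketch: it assumes GNU date (for the -d option), and elapsed_seconds is a hypothetical helper name, not part of the AWS CLI.

```shell
#!/bin/bash
# Seconds between two ISO 8601 timestamps as printed by describe-stack-events.
# Fractional seconds are dropped, so results are accurate to about a second.
# Assumes GNU date (the -d option is not available in BSD/macOS date).
elapsed_seconds() {
  local start="${1%Z}" end="${2%Z}"
  start="${start%.*}Z"   # strip ".274" style fractions, keep the UTC marker
  end="${end%.*}Z"
  echo $(( $(date -u -d "$end" +%s) - $(date -u -d "$start" +%s) ))
}

# Stack creation: "Resource creation Initiated" -> "Received SUCCESS signal"
elapsed_seconds "2017-02-03T05:33:33.274Z" "2017-02-03T05:36:42.487Z"

# Stack update: UPDATE_IN_PROGRESS -> UPDATE_COMPLETE on the Auto Scaling group
elapsed_seconds "2017-02-03T05:44:20.845Z" "2017-02-03T05:45:43.047Z"
```

The first call prints 189 and the second prints 83. The creation took longer than the 120-second sleep, as expected, whereas the update finished in less time than the sleep alone.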
The new instance does indeed exist.
In my real use case I've SSHed into one of these new instances to find that it was still in the process of initialising even after the stack update completed.
I've read and re-read the documentation surrounding CreationPolicy and UpdatePolicy, but have failed to identify what I'm missing.
Taking a look at the update policy in use above, I don't understand what it's actually doing. Why is WaitOnResourceSignals true, but it's not waiting? Is it serving some other purpose?
Or are these new instances not falling under the "rolling update" policy? If they don't belong there, then I'd expect them to fall under the creation policy, but that doesn't seem to apply either.
As such, I don't really know what else to try.
I have a sneaking feeling that it's functioning as designed/expected, but if it is then what's the point of that WaitOnResourceSignals property and how can I meet the expectation set above?
The AutoScalingRollingUpdate policy handles rotating out an entire set of instances in an Auto Scaling group in response to changes to the underlying LaunchConfiguration. It doesn't apply to individual changes to the number of instances in the existing group. According to the UpdatePolicy Attribute documentation,
The AutoScalingReplacingUpdate and AutoScalingRollingUpdate policies apply only when you do one or more of the following:
- Change the Auto Scaling group's AWS::AutoScaling::LaunchConfiguration.
- Change the Auto Scaling group's VPCZoneIdentifier property
- Update an Auto Scaling group that contains instances that don't match the current LaunchConfiguration.
Changing the Auto Scaling group's DesiredCapacity property is not in this list, so the AutoScalingRollingUpdate policy does not apply to this type of change.
As far as I know, it is not possible (using standard AWS CloudFormation resources) to delay the completion of a Stack Update modifying DesiredCapacity until any new instances added to the Auto Scaling Group are fully provisioned.
Here are some alternative options:
- When modifying DesiredCapacity, modify a LaunchConfiguration property at the same time. This will trigger an AutoScalingRollingUpdate to the desired capacity (the downside is that it will also update existing instances, which may not actually need to be modified).
- Add an AWS::AutoScaling::LifecycleHook resource to your Auto Scaling Group, and call aws autoscaling complete-lifecycle-action in addition to cfn-signal, to signal lifecycle-hook completion. This won't delay your CloudFormation stack update as desired, but it will delay the individual auto-scaled instances from entering the InService state until the lifecycle signal is received. (See Lifecycle Hooks documentation for more info.)
- ... DesiredCapacity number of instances all in the InService state.
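One way to check that the group has the desired number of instances in the InService state is to poll it from the client after the update returns. The following is a rough sketch, not a definitive implementation: count_in_service and wait_for_in_service are hypothetical helper names, the attempt count and delay are arbitrary defaults, and the script runs outside CloudFormation, so it does not delay the stack update itself. Note also that InService only implies that User Data has finished if a launch lifecycle hook is holding instances back until initialisation completes.

```shell
#!/bin/bash
# Count the instances in the group whose lifecycle state is InService.
# Uses the AWS CLI's JMESPath --query support to filter and count.
count_in_service() {
  local asg_name="$1" region="$2"
  aws autoscaling describe-auto-scaling-groups \
    --region "$region" \
    --auto-scaling-group-names "$asg_name" \
    --query "length(AutoScalingGroups[0].Instances[?LifecycleState=='InService'])" \
    --output text
}

# Poll until at least $want instances are InService.
# Returns 0 on success, 1 if attempts run out, 2 if the CLI call fails.
wait_for_in_service() {
  local asg_name="$1" region="$2" want="$3"
  local attempts="${4:-40}" delay="${5:-15}"   # arbitrary defaults (about 10 minutes)
  local i have
  for (( i = 1; i <= attempts; i++ )); do
    have="$(count_in_service "$asg_name" "$region")" || return 2
    if [ "$have" -ge "$want" ]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example usage (stack name and region taken from the question):
#   asg_name="$(aws cloudformation describe-stack-resource \
#       --region ap-northeast-1 --stack-name asg-test \
#       --logical-resource-id AutoScalingGroup \
#       --query StackResourceDetail.PhysicalResourceId --output text)"
#   wait_for_in_service "$asg_name" ap-northeast-1 2 || echo "instances not ready" >&2
```

Keeping the polling in a separate function makes it straightforward to move the same check elsewhere later, for example into a deployment script that runs update-stack and then fails the deployment if the wait times out.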