I am deploying services to ECS Fargate behind an ALB. During a deployment, the ALB sends health checks to the service, and after 3 consecutive failed health checks ECS destroys the newly deployed tasks and keeps the old version of the container running. I am looking for a way to monitor these deployment failures. One possible solution is to monitor ECS task state changes and send an alert when a container's status becomes STOPPED. But this is not specific to deployments: a container can become STOPPED at any time if there is an error, and during a deployment the old container's status becomes STOPPED as well. Are there any other metrics I can use to monitor deployment failures?
I think the accepted answer may be a bit dated. The AWS CLI has a command designed specifically to confirm that a recently deployed ECS service was deployed successfully:
aws ecs wait services-stable
The above command polls every 15 seconds until the service reaches a stable state. It exits with return code 255 after 40 failed checks (about 10 minutes).
https://docs.aws.amazon.com/cli/latest/reference/ecs/wait/services-stable.html
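As a rough sketch of how the waiter could sit at the end of a deploy script: the helper name wait_for_stable and the cluster and service arguments are placeholders of mine, not anything defined by AWS. It only wraps the command above and reports its exit code.

```shell
#!/bin/bash
# Sketch: block until the service is stable, or report why the wait ended.
# wait_for_stable is a hypothetical helper; cluster and service names are placeholders.
wait_for_stable() {
  local cluster="$1" service="$2" rc
  aws ecs wait services-stable --cluster "$cluster" --services "$service"
  rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "service $service is stable"
  else
    # A non-zero code (255 after the waiter gives up) is treated as a failed deployment.
    echo "service $service did not stabilize (exit code $rc)" >&2
  fi
  return "$rc"
}

# Usage, e.g. as the last step of a pipeline:
#   wait_for_stable my-cluster my-service || exit 1
```

Because the waiter only reports that the service never became stable, it does not say why; the service events in the console or in describe-services are still the place to look for the cause.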
Additionally, you can use Amazon EventBridge to respond to ECS events (container instance state change events, task state change events, and service action events). Rules can route these events to a number of useful targets: CloudWatch Logs, Lambda, EC2 Run Command, Kinesis, Step Functions, SNS topics, and SQS queues.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch_event_stream.html
You can now use the deployment circuit breaker, which was released in Nov 2020:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/deployment-type-ecs.html
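For an existing service, turning the circuit breaker on from the CLI might look roughly like this. The helper name enable_circuit_breaker and its arguments are placeholders; rollback=true is my assumption that you want ECS to roll back to the last completed deployment automatically.

```shell
#!/bin/bash
# Sketch: enable the deployment circuit breaker (with rollback) on a service.
# enable_circuit_breaker is a hypothetical helper; cluster and service are placeholders.
enable_circuit_breaker() {
  local cluster="$1" service="$2"
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true}"
}

# Usage:
#   enable_circuit_breaker my-cluster my-service
```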
You can then use CloudWatch Events to pick up the state changes and trigger a Lambda function:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs_cwet.html
The Lambda function can then send a notification to Slack when, for example, "eventName": "SERVICE_DEPLOYMENT_FAILED":
https://gist.github.com/KensoDev/d9f5ea978b16bac06463c6c78191f220
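A minimal sketch of a rule that matches only failed deployments, assuming the events arrive with source aws.ecs and detail-type "ECS Deployment State Change". The helper name, the rule name, the target Id "notify", and the target ARN are all placeholders.

```shell
#!/bin/bash
# Sketch: create a rule for SERVICE_DEPLOYMENT_FAILED and attach one target to it.
# create_deploy_failed_rule is a hypothetical helper; rule name and target ARN are placeholders.
create_deploy_failed_rule() {
  local rule_name="$1" target_arn="$2"
  local pattern='{"source":["aws.ecs"],"detail-type":["ECS Deployment State Change"],"detail":{"eventName":["SERVICE_DEPLOYMENT_FAILED"]}}'
  aws events put-rule --name "$rule_name" --event-pattern "$pattern" &&
    aws events put-targets --rule "$rule_name" --targets "Id=notify,Arn=$target_arn"
}

# Usage (the ARN can point at a Lambda function or an SNS topic):
#   create_deploy_failed_rule ecs-deploy-failed "$TARGET_ARN"
```

The target also has to allow EventBridge to invoke it or publish to it, which this sketch does not set up.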
Normally, we integrate the deployment check at the end of our CI/CD pipeline. I am not sure which CI tool you are using, but if you use Jenkins, you can do that in the post stage.
After you update the ECS service, there is a Deployments tab in the ECS service console; you can watch it until the ACTIVE row disappears, which means the new task has been deployed. The same information is available through aws-cli, so you can use aws-cli and jq to run a simple loop that checks whether your new task has been deployed.
Here is a sample script for reference:
#!/bin/bash
# Requires aws-cli and jq; ECS_CLUSTER and SERVICE_NAME must be set.
RESULT=$(aws ecs describe-services --cluster "${ECS_CLUSTER}" --services "${SERVICE_NAME}" \
  | jq -r '.services[].deployments[] | select(.status == "ACTIVE")')
# No ACTIVE status means deployment complete
if [ -z "$RESULT" ]; then
  exit 0
else
  echo "$RESULT"
  exit 1
fi
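The script above performs a single check, so it needs to be called repeatedly. One way the loop could look is sketched below. Note that it swaps jq for the CLI's built-in --query filter; wait_for_rollout, max_attempts, and interval are names I made up, and the defaults of 40 attempts and 15 seconds are arbitrary.

```shell
#!/bin/bash
# Sketch: poll until no deployment has status ACTIVE, or give up after max_attempts.
# wait_for_rollout, max_attempts and interval are illustrative names, not AWS settings.
wait_for_rollout() {
  local cluster="$1" service="$2" max_attempts="${3:-40}" interval="${4:-15}"
  local attempt=1 active
  while [ "$attempt" -le "$max_attempts" ]; do
    # Ask only for the ids of deployments that are still ACTIVE.
    active=$(aws ecs describe-services --cluster "$cluster" --services "$service" \
      --query "services[0].deployments[?status=='ACTIVE'].id" --output text) || return 2
    if [ -z "$active" ]; then
      echo "rollout complete after $attempt check(s)"
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "still ACTIVE after $max_attempts checks: $active" >&2
  return 1
}

# Usage:
#   wait_for_rollout "$ECS_CLUSTER" "$SERVICE_NAME" || exit 1
```

Return code 1 means the loop timed out with an ACTIVE deployment still present, and 2 means the describe-services call itself failed, so a CI job can tell the two apart.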
Hope this helps.