
Looking for a good way to monitor and get notified of ECS deployment failures

I am deploying services to ECS Fargate behind an ALB. During a deployment, the ALB sends health checks to the service, and if 3 consecutive health checks fail, ECS destroys the newly deployed tasks and keeps the old version of the container running. I am looking for a way to monitor these deployment failures. One possible solution is to monitor ECS task state changes and send an alert when a container's status becomes STOPPED. But this is not specific to deployments: a container can become STOPPED at any time if there is an error, and during a deployment the old container's status becomes STOPPED as well. So are there any other metrics or events I can use to monitor deployment failures?

Joey Yi Zhao asked Dec 24 '19 02:12


3 Answers

I think the accepted answer may be a bit dated. The AWS CLI has a command specifically designed to check that a recently deployed ECS service was deployed successfully.

aws ecs wait services-stable

The above command will poll every 15 seconds until a successful state is reached. It will exit with a 255 error code after 40 failed checks.

https://docs.aws.amazon.com/cli/latest/reference/ecs/wait/services-stable.html
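As a rough sketch of how the waiter might be used at the end of a deploy script, the function below wraps the command and turns its exit status into a message. The function name and the messages are hypothetical; the cluster and service names are whatever your deployment uses.

```shell
#!/bin/sh
# Wait for an ECS service to stabilise after a deployment.
# Arguments: cluster name, service name (both supplied by the caller).
# wait_for_stable is an illustrative helper name, not part of the AWS CLI.
wait_for_stable() {
  cluster="$1"
  service="$2"
  # The waiter returns non-zero (255) if the service never becomes stable.
  if aws ecs wait services-stable --cluster "$cluster" --services "$service"; then
    echo "deployment stable: $service"
    return 0
  else
    echo "deployment did not stabilise: $service" >&2
    return 1
  fi
}
```

Calling `wait_for_stable my-cluster my-service || exit 1` as the last step of a pipeline would fail the build when the service does not settle, which is where you could hook in a notification.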

Additionally, you can use Amazon EventBridge to respond to ECS events (container instance state change events, task state change events, and service action events). Rules can route those events to a bunch of useful targets: CloudWatch Logs, Lambda, EC2 Run Command, Kinesis, Step Functions, and SNS topics or SQS queues.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch_event_stream.html

Phil Ninan answered Oct 04 '22 01:10


You can now use the deployment circuit breaker, which was released in November 2020:

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/deployment-type-ecs.html
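For reference, here is a minimal sketch of switching the circuit breaker on for an existing service from the CLI. The helper name is made up for illustration, and it assumes a service using the rolling update (ECS) deployment type; check the linked page for the options that apply to your setup.

```shell
#!/bin/sh
# Turn on the deployment circuit breaker with automatic rollback.
# Arguments: cluster name, service name (both supplied by the caller).
# enable_circuit_breaker is an illustrative helper name.
enable_circuit_breaker() {
  cluster="$1"
  service="$2"
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true}"
}
```

With `rollback=true`, ECS rolls the service back to the last completed deployment when the breaker trips, so the failure shows up as an event rather than as a service stuck retrying.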

Then you can use CloudWatch Events to pick up the state changes and trigger a Lambda function:

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs_cwet.html

From there you can, for example, send a notification to Slack when the event contains "eventName": "SERVICE_DEPLOYMENT_FAILED":

https://gist.github.com/KensoDev/d9f5ea978b16bac06463c6c78191f220
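If you would rather wire the rule up from the CLI, the following is a sketch under some assumptions: the rule name, target Id, and SNS topic ARN are placeholders, and the topic is assumed to already allow EventBridge to publish to it. The pattern matches ECS deployment state change events whose eventName is SERVICE_DEPLOYMENT_FAILED.

```shell
#!/bin/sh
# Print an event pattern that matches failed ECS deployments.
deployment_failed_pattern() {
  cat <<'EOF'
{
  "source": ["aws.ecs"],
  "detail-type": ["ECS Deployment State Change"],
  "detail": { "eventName": ["SERVICE_DEPLOYMENT_FAILED"] }
}
EOF
}

# Create the rule and attach an SNS topic as its target.
# Arguments: rule name, SNS topic ARN (both hypothetical, supplied by the caller).
create_failure_rule() {
  rule_name="$1"
  topic_arn="$2"
  aws events put-rule --name "$rule_name" \
    --event-pattern "$(deployment_failed_pattern)" &&
  aws events put-targets --rule "$rule_name" \
    --targets "Id=sns-alert,Arn=$topic_arn"
}
```

Swapping the SNS ARN for a Lambda function ARN would give you the Slack-notifier setup described above, provided the function grants EventBridge permission to invoke it.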

dasilvadaniel answered Oct 04 '22 02:10


Normally, we integrate the deployment check at the end of our CI/CD pipeline.

I am not sure which CI tool you are using, but if you use Jenkins, you can do that in the post stage.

After you update the ECS service, there is a Deployments tab on the ECS service console. You can check there until the ACTIVE row disappears, which means the new task has been deployed. The same information is available through the aws-cli, so you can use aws-cli and jq to run a simple loop that checks whether your new task has been deployed.

Here is a sample script you can use as a reference:

 #!/bin/bash

 # Collect any deployments still in ACTIVE status (the old deployment being replaced)
 RESULT=$(aws ecs describe-services --cluster "${ECS_CLUSTER}" --services "${SERVICE_NAME}" \
   | jq -r '.services[].deployments[] | select(.status == "ACTIVE")')

 # No ACTIVE status means deployment complete
 if [ -z "$RESULT" ]; then
   exit 0
 else
   echo "$RESULT"
   exit 1
 fi
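The script above performs a single check, so it needs to be called repeatedly. One possible way to write the loop is sketched below. It uses the CLI's built-in `--query` option instead of jq, and `POLL_INTERVAL`, `MAX_ATTEMPTS`, and the function name are hypothetical tunables rather than anything ECS defines.

```shell
#!/bin/sh
# Poll until no deployment is in ACTIVE status, or give up after MAX_ATTEMPTS checks.
# Arguments: cluster name, service name (both supplied by the caller).
# POLL_INTERVAL (seconds) and MAX_ATTEMPTS are illustrative settings with defaults.
wait_for_deployment() {
  cluster="$1"
  service="$2"
  attempt=1
  while [ "$attempt" -le "${MAX_ATTEMPTS:-40}" ]; do
    # Count the deployments of this service that are still ACTIVE.
    active=$(aws ecs describe-services --cluster "$cluster" --services "$service" \
      --query "length(services[0].deployments[?status=='ACTIVE'])" \
      --output text) || return 2
    if [ "$active" = "0" ]; then
      echo "deployment complete after $attempt check(s)"
      return 0
    fi
    sleep "${POLL_INTERVAL:-15}"
    attempt=$((attempt + 1))
  done
  echo "timed out waiting for deployment" >&2
  return 1
}
```

A return status of 1 means the old deployment was still present when the attempts ran out, and 2 means the describe call itself failed, so a CI job can tell the two cases apart.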

Hope it helps.

David Hsu answered Oct 04 '22 03:10