How does ECS Fargate auto scaling policy know not to kill a working task?

Tags:

amazon-web-services

amazon-ecs

aws-fargate

I have an ECS Fargate cluster with a service that scales its number of tasks based on how many messages are in a queue. Each task long-polls the queue and processes one message at a time. If the queue grows past 5 messages, a new task is spun up and starts taking messages. When the queue drops back below 5 messages, a task is shut down.

My question is: when the service decides to scale down, how does it know which task to kill? All tasks could be processing a message. Each task runs continuously and long-polls SQS, so how would it know whether a task is in a valid shutdown state (it has just completed a message) or a non-valid shutdown state (it is currently processing a message)?

asked Apr 23 '20 16:04

Gary Holiday

1 Answer

There is an open issue about improving task termination, as other people have the same concern as you:

[ECS] [request]: Control which containers are terminated on scale in #125

According to the issue, you can minimize the impact of task termination on your processes by using stopTimeout. The parameter is described as:

Time duration (in seconds) to wait before the container is forcefully killed if it doesn't exit normally on its own.
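
One way to make use of that window is sketched below, assuming the worker is written in Python. When ECS stops a task it sends SIGTERM to the container and only sends SIGKILL once stopTimeout has elapsed, so the worker can finish the message in flight and then exit on its own. The `GracefulWorker` class, the `handle` callback and the `QUEUE_URL` environment variable are hypothetical names used for illustration; only the boto3 `receive_message` and `delete_message` calls are real API.

```python
import signal
import threading


class GracefulWorker:
    """Processes one message at a time and stops only between messages."""

    def __init__(self, receive, process, delete):
        self._receive = receive    # returns a message, or None if the poll was empty
        self._process = process    # does the actual work for one message
        self._delete = delete      # removes the message from the queue
        self._stop = threading.Event()

    def request_stop(self, signum=None, frame=None):
        # Usable directly as a signal handler: it only sets a flag.
        self._stop.set()

    def run(self):
        handled = 0
        # The flag is checked only here, so a message that is already being
        # processed is always finished and deleted before the loop exits.
        while not self._stop.is_set():
            message = self._receive()
            if message is None:
                continue
            self._process(message)
            self._delete(message)
            handled += 1
        return handled


def main():
    import os
    import boto3  # third-party; imported here so the class above has no AWS dependency

    queue_url = os.environ["QUEUE_URL"]  # hypothetical variable name
    sqs = boto3.client("sqs")

    def receive():
        # Long poll for at most one message.
        resp = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20
        )
        messages = resp.get("Messages", [])
        return messages[0] if messages else None

    def delete(message):
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])

    def handle(message):
        print("processing", message["MessageId"])  # placeholder for real work

    worker = GracefulWorker(receive, handle, delete)
    signal.signal(signal.SIGTERM, worker.request_stop)
    worker.run()
```

Calling `main()` as the container's entry point would start the loop. This does not stop ECS from picking a busy task; it only works if the remaining poll plus the processing of one message fits inside stopTimeout, and if SIGTERM actually reaches the Python process (for example, the process is not hidden behind a shell that swallows signals).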

But there is also a new feature for ECS:

Amazon ECS Introduces Enhanced Container Dependency Management

With this you can set up:

dependencies for container startup and shutdown as well as a per-container start and stop timeout value.
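
As a rough illustration of the per-container stop timeout, the sketch below builds a container definition with `stopTimeout` and registers it through boto3. The helper names, the family name and the CPU and memory sizes are made up for the example. The 120-second ceiling reflects the documented Fargate maximum as far as I know, so check it against the current ECS documentation.

```python
FARGATE_MAX_STOP_TIMEOUT = 120  # seconds; assumed Fargate maximum, verify in the ECS docs


def container_definition(name, image, stop_timeout):
    """Build a container definition dict that sets stopTimeout (hypothetical helper)."""
    if not 0 <= stop_timeout <= FARGATE_MAX_STOP_TIMEOUT:
        raise ValueError(
            f"stopTimeout must be between 0 and {FARGATE_MAX_STOP_TIMEOUT} seconds"
        )
    return {
        "name": name,
        "image": image,
        "essential": True,
        # Time ECS waits after SIGTERM before it sends SIGKILL.
        "stopTimeout": stop_timeout,
    }


def register_worker_task(image, stop_timeout=120):
    import boto3  # third-party; imported lazily so the helper above has no AWS dependency

    ecs = boto3.client("ecs")
    return ecs.register_task_definition(
        family="queue-worker",  # hypothetical family name
        requiresCompatibilities=["FARGATE"],
        networkMode="awsvpc",
        cpu="256",     # illustrative sizes only
        memory="512",
        containerDefinitions=[container_definition("worker", image, stop_timeout)],
    )
```

A real Fargate task definition usually needs more than this (an execution role and logging configuration, for instance); the sketch only shows where `stopTimeout` goes.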

So in general you do not have full control over which tasks are terminated, unlike the control you have over instance termination in an Auto Scaling group. But this is being worked on.

answered Oct 04 '22 01:10

Marcin
