I have an ECS Fargate cluster with a service that scales its task count based on how many messages are in a queue. Each task long-polls the queue and processes one message at a time. If the queue grows beyond 5 messages, a new task is spun up and starts taking messages. When the queue later drops back below 5 messages, the service shuts down a task.
My question is: when the service decides to scale in, how does it know which task to kill? All tasks could be processing a message. Each task runs continuously and long-polls SQS, so how would the service know whether a task is in a valid shutdown state (it has just completed a message) or an invalid one (it is currently processing a message)?
Fargate supports auto scaling, which you can enable in your service configuration. You will need to set it to scale against a specific metric (such as average CPU or average network in).
Your Amazon ECS tasks might stop for a variety of reasons. The most common are that an essential container exited or that Elastic Load Balancing (ELB) health checks failed.
The ephemeral storage configuration depends on which platform version the task is using. After a Fargate task stops, the ephemeral storage is deleted. For more information about Amazon ECS default service limits, see Amazon ECS service quotas. The host and sourcePath parameters are not supported for Fargate tasks.
Automatic scaling is the ability to increase or decrease the desired count of tasks in your Amazon ECS service automatically. Amazon ECS leverages the Application Auto Scaling service to provide this functionality. For more information, see the Application Auto Scaling User Guide.
There is an open issue about improving task termination, since other people have the same concern as you:
According to the issue, you can minimize the impact of task termination on your processes by using stopTimeout. The parameter is described as:
Time duration (in seconds) to wait before the container is forcefully killed if it doesn't exit normally on its own.
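To make use of stopTimeout, the worker itself has to react to the stop request. When ECS stops a task it sends SIGTERM to the containers and only sends SIGKILL once stopTimeout has elapsed (on Fargate the value can be at most 120 seconds). Below is a minimal sketch of a worker loop that finishes the message it is working on and then exits instead of polling again. The receive, process and delete callables are hypothetical placeholders for your own SQS calls, not part of any AWS API.

```python
import signal
import threading

# Set once a stop has been requested (ECS delivers SIGTERM on scale-in).
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request; the loop decides when it is safe to exit."""
    stop_requested.set()


def install_handler():
    """Call once at process start, from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(receive, process, delete):
    """Poll until a stop is requested, never abandoning a message mid-way.

    receive() -> message or None   (hypothetical: wraps the SQS long poll)
    process(message)               (hypothetical: your business logic)
    delete(message)                (hypothetical: removes it from the queue)
    Returns the number of messages fully handled.
    """
    handled = 0
    # The flag is checked only between messages, so a message that has
    # been received is always processed and deleted before exiting.
    while not stop_requested.is_set():
        message = receive()
        if message is None:
            continue  # long poll returned empty; check the flag again
        process(message)
        delete(message)
        handled += 1
    return handled
```

This only helps if one message can be processed within stopTimeout. If processing takes longer, the container is still killed mid-message, and the message becomes visible in the queue again after its visibility timeout expires.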
But there is also a new feature for ECS:
With this you can set up:
dependencies for container startup and shutdown as well as a per-container start and stop timeout value.
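As a rough illustration of those settings, here is a sketch of container definitions written as the Python structure that boto3's register_task_definition accepts under containerDefinitions. The stopTimeout, startTimeout and dependsOn keys are real task definition parameters. The container names, image names and the helper function are made up for the example.

```python
# Fargate accepts a stopTimeout of at most 120 seconds.
FARGATE_MAX_STOP_TIMEOUT = 120


def build_container_definitions(stop_timeout=120):
    """Return container definitions for a worker plus a sidecar.

    All names and images here are hypothetical.
    """
    if not 0 <= stop_timeout <= FARGATE_MAX_STOP_TIMEOUT:
        raise ValueError("stopTimeout on Fargate must be between 0 and 120")
    return [
        {
            "name": "worker",                # hypothetical name
            "image": "example/sqs-worker",   # hypothetical image
            "essential": True,
            # Seconds between SIGTERM and SIGKILL for this container.
            "stopTimeout": stop_timeout,
            # Start only after the sidecar has started; on shutdown the
            # order is reversed, so the worker is stopped first.
            "dependsOn": [
                {"containerName": "log-sidecar", "condition": "START"}
            ],
        },
        {
            "name": "log-sidecar",           # hypothetical name
            "image": "example/log-sidecar",  # hypothetical image
            "essential": False,
            # Seconds to wait for this container's dependency condition
            # to be met before giving up.
            "startTimeout": 60,
        },
    ]
```

The resulting list would be passed as containerDefinitions when registering the task definition. Giving the worker the longest allowed stopTimeout gives it the most room to finish the message in flight.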
So in general you do not have full control over task termination the way you do over instance termination in an Auto Scaling group. But it is being worked on.