
How to keep the last X ECS task definitions active?

I have the following Terraform code to update a service with a new task definition:

resource "aws_ecs_task_definition" "app_definition" {
  family = "my-family"

  container_definitions = "${data.template_file.task_definition.rendered}"
  network_mode          = "bridge"
}

resource "aws_ecs_service" "app_service" {
  name            = "my-service"
  cluster         = "my-cluster"
  task_definition = "${aws_ecs_task_definition.app_definition.arn}"
  desired_count   = 1
  iam_role        = "my-iam-role"
}

When I update my service, the previous revision of my task definition becomes inactive. As a result, I cannot select it when trying to manually roll back to a previous revision in the ECS console:

Error: No active task definition found

Ideally, I want to keep the last X revisions active so I can always manually roll back via the console if something goes wrong.

How can I achieve that?

asked Aug 10 '18 07:08 by bitbrain



1 Answer

Terraform doesn't currently allow for this. Task definitions are immutable, so any change forces a replacement, and under Terraform's resource lifecycle model a replacement means creating a new task definition and destroying the old one.

ECS task definitions also can't really be destroyed. Instead they are just marked as inactive, because there may be tasks currently deployed that still use them until the service updates those tasks to the new task definition.

There are two common ways of dealing with this while keeping the ability to roll back to a previous version of a task definition.

The first is simply not to use Terraform to manage the task definition beyond its initial creation, and to use something like the AWS ECS CLI tool to register new revisions instead.
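As a rough sketch of that first option, and assuming Terraform 0.12 or later syntax, the service can be told to stop reconciling its `task_definition` once it has been created, so that revisions registered outside Terraform are not reverted on the next apply. The resource names are the ones from the question; whether this fits depends on how the rest of the configuration is laid out.

```hcl
resource "aws_ecs_service" "app_service" {
  name            = "my-service"
  cluster         = "my-cluster"
  task_definition = aws_ecs_task_definition.app_definition.arn
  desired_count   = 1
  iam_role        = "my-iam-role"

  lifecycle {
    # Terraform sets the task definition at creation time only;
    # later revisions deployed by another tool are left alone.
    ignore_changes = [task_definition]
  }
}
```

On Terraform 0.11 the attribute name would need to be quoted, as in `ignore_changes = ["task_definition"]`.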

The other option, and the one that I use, is to have my CI (GitLab CI in our case) build a Docker image tagged with the commit SHA of the application being deployed. On apply, Terraform then updates the task definition to point at that SHA-tagged image and updates the ECS service with the new task definition ARN.

When we want to roll back, we use our CI's ability to roll back to a different commit: it launches just the deploy job with the old commit SHA and so deploys the old image.
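The CI-driven approach described above could look roughly like the following. This is a minimal sketch, not the exact setup used here: it assumes Terraform 0.12 or later, and the `image_tag` variable, the container name `app`, the memory value, and the registry URL `registry.example.com/my-app` are all hypothetical placeholders.

```hcl
# Hypothetical variable: the CI job passes in the commit SHA to deploy.
variable "image_tag" {
  description = "Commit SHA used as the Docker image tag"
  type        = string
}

resource "aws_ecs_task_definition" "app_definition" {
  family       = "my-family"
  network_mode = "bridge"

  # Changing image_tag changes the container definitions, which makes
  # Terraform register a new task definition revision.
  container_definitions = jsonencode([
    {
      name      = "app"                                               # placeholder
      image     = "registry.example.com/my-app:${var.image_tag}"      # placeholder
      memory    = 256                                                 # placeholder
      essential = true
    }
  ])
}
```

A deploy job would then run something like `terraform apply -var "image_tag=$CI_COMMIT_SHA"` (`CI_COMMIT_SHA` being GitLab CI's predefined variable for the commit). A rollback is the same command run from the older commit's pipeline, so the older SHA is applied and the service moves back to the older image.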

This keeps Terraform pretty agnostic of what's being deployed and makes the CI system responsible for deploying the required version. That is normally the latest commit, but it can be a specific commit if we have a manual click-to-deploy step, or the previous version we are targeting when rolling back.

It does mean that you can't launch rollbacks through the AWS console, but I actually like this, as I want the CI system to be the source of truth for what is deployed at any time.

answered Sep 28 '22 01:09 by ydaetskcoR