CloudFormation template for creating ECS service stuck in CREATE_IN_PROGRESS

I am creating an AWS ECS service using CloudFormation.

Everything seems to complete successfully. I can see the instance being attached to the load balancer, the load balancer reports the instance as healthy, and if I hit the load balancer I am taken to my running container.

In the ECS console, I can see that the service has stabilised and that everything looks OK. I can also see that the container is stable and is not being terminated or re-created.

However, the CloudFormation stack never completes. It stays in CREATE_IN_PROGRESS for about 30-60 minutes and then rolls back, claiming that the service did not stabilise. In CloudTrail, I can see a number of RegisterInstancesWithLoadBalancer calls made by ecs-service-scheduler, all with the same parameters, i.e. the same instance ID and load balancer. I am using the standard IAM roles and permissions for ECS, so it should not be a permissions issue.

Has anyone had a similar issue?

asked Sep 22 '15 21:09

Anvar

2 Answers

Your AWS::ECS::Service needs to reference the TaskDefinition by its full ARN (source: the answer from ChrisB@AWS on the AWS forums). The key thing is to include the revision in that ARN. If you omit the revision (:123 in the example below), the latest revision is used, but CloudFormation still sits in CREATE_IN_PROGRESS for about an hour before failing. Here's one way to do that:

"MyService": {
    "Type": "AWS::ECS::Service",
    "Properties": {
        "Cluster": { "Ref": "ECSClusterArn" },
        "DesiredCount": 1,
        "LoadBalancers": [
            {
                "ContainerName": "myContainer",
                "ContainerPort": "80",
                "LoadBalancerName": "MyELBName"
            }
        ],
        "Role": { "Ref": "EcsElbServiceRoleArn" },
        "TaskDefinition": {
            "Fn::Join": ["", ["arn:aws:ecs:", { "Ref": "AWS::Region" },
            ":", { "Ref": "AWS::AccountId" },
            ":task-definition/my-task-definition-name:123"]]
        }
    }
}

Here's a way to grab the ARN of the latest revision of MyTaskDefinition with the AWS CLI and jq:

aws ecs list-task-definitions --family-prefix MyTaskDefinition --sort DESC | jq --raw-output '.taskDefinitionArns[0]'
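If you assemble the ARN in a deploy script rather than in the template, a small sanity check can catch a missing revision before CloudFormation ever sees it. This is only a sketch: the function names are made up for illustration, and the check assumes the standard aws partition and the ARN layout used in the template above.

```shell
# Hypothetical helpers, not part of the AWS CLI.

# Build a full task definition ARN.
# $1 = region, $2 = account id, $3 = task definition family, $4 = revision
task_definition_arn() {
  printf 'arn:aws:ecs:%s:%s:task-definition/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Succeed only if $1 looks like a task definition ARN ending in a numeric revision.
has_revision() {
  case "$1" in
    arn:aws:ecs:*:task-definition/*:*) ;;  # expected overall shape
    *) return 1 ;;
  esac
  revision="${1##*:}"                      # text after the last colon
  case "$revision" in
    '' | *[!0-9]*) return 1 ;;             # empty or not purely digits
  esac
  return 0
}
```

For example, `has_revision "$(task_definition_arn us-east-1 123456789012 my-task-definition-name 123)"` succeeds, while the same ARN without the trailing `:123` is rejected.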

answered Sep 28 '22 01:09

Pete

I found another related scenario that will cause this and thought I'd put it here in case anyone else runs into it. If you define a TaskDefinition with an Image that doesn't actually exist in its ContainerDefinition and then you try to run that TaskDefinition as a Service, you'll run into the same hang issue (or at least something that looks like the same issue).

NOTE: The example YAML chunks below were all in the same CloudFormation template

So as an example, I created this Repository:

MyRepository:
    Type: AWS::ECR::Repository

And then I created this Cluster:

MyCluster:
    Type: AWS::ECS::Cluster

And this TaskDefinition (abridged):

MyECSTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
        # ...
        ContainerDefinitions:
            # ...
              Image: !Join ["", [!Ref "AWS::AccountId", ".dkr.ecr.", !Ref "AWS::Region", ".amazonaws.com/", !Ref MyRepository, ":1"]]
            # ...

With those defined, I went to create a Service like this:

MyECSServiceDefinition:
    Type: AWS::ECS::Service
    Properties:
        Cluster: !Ref MyCluster
        DesiredCount: 2
        PlacementStrategies:
            - Type: spread
              Field: attribute:ecs.availability-zone
        TaskDefinition: !Ref MyECSTaskDefinition

This all seemed sensible to me, but it turns out there are two issues with this as written/deployed that caused it to hang:

1. The DesiredCount is set to 2, which means CloudFormation will actually try to start the service's tasks, not just define the service. If I set DesiredCount to 0, this works just fine.
2. The Image defined in MyECSTaskDefinition doesn't exist yet. I made the repository as part of this template, but I didn't actually push anything to it. So when MyECSServiceDefinition tried to start the DesiredCount of 2 tasks, it hung because the image wasn't available in the repository (the repository had literally just been created in the same template).
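One way to confirm the missing image is to ask ECR whether the tag exists before letting the service start tasks. A minimal sketch, assuming the AWS CLI is configured for the right account and region; the function name is hypothetical and the repository name and tag are whatever your template produces:

```shell
# Hypothetical helper: succeeds only if the tag exists in the ECR repository.
# $1 = repository name, $2 = image tag
ecr_image_exists() {
  # describe-images exits non-zero when the repository or the tag is missing.
  aws ecr describe-images \
    --repository-name "$1" \
    --image-ids "imageTag=$2" \
    >/dev/null 2>&1
}
```

Something like `ecr_image_exists my-repository 1 || echo "push the image first"` would then flag the situation described above, where `my-repository` stands in for the physical name of MyRepository.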

So, for now, the solution is to create the CloudFormation stack with a DesiredCount of 0 for the Service, push the appropriate Image to the repository, and then update the CloudFormation stack to scale up the service. Alternatively, use one template that sets up core infrastructure such as the repository, push builds to that repository, and then use a separate template to set up the Services themselves.
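The first approach could be scripted roughly like this. It is a sketch under a few assumptions that are not in the template above: the template exposes a parameter named DesiredCount that feeds the service's DesiredCount property, the image is already built and tagged with its full ECR URI, and docker is already logged in to the registry. The function name is made up.

```shell
# Hypothetical two-phase deploy.
# $1 = stack name, $2 = template file, $3 = full image URI (repository:tag),
# $4 = final number of tasks
deploy_in_two_phases() {
  # Phase 1: create the repository, cluster, task definition and an idle service.
  aws cloudformation deploy \
    --stack-name "$1" \
    --template-file "$2" \
    --parameter-overrides DesiredCount=0 || return 1

  # Push the image that the task definition points at.
  docker push "$3" || return 1

  # Phase 2: update the same stack so the service starts its tasks.
  aws cloudformation deploy \
    --stack-name "$1" \
    --template-file "$2" \
    --parameter-overrides "DesiredCount=$4"
}
```

If the template also creates IAM roles, the deploy calls would additionally need the appropriate `--capabilities` value.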

Hope that helps anyone having this issue!

answered Sep 28 '22 00:09

Brent Writes Code

Tags:

amazon-web-services

amazon-ecs

amazon-cloudformation
