CannotPullContainerError: context canceled error when starting ECS task

I am starting an ECS task with Fargate and the container ends up in a STOPPED state after being in PENDING for a few minutes. The Status gives the following error message:

CannotPullContainerError: context canceled

I am using PrivateLink to allow the ECS host to talk to the ECR registry without having to go via the public Internet and this is how it is configured (Serverless syntax augmenting CloudFormation):

      Properties:
        PrivateDnsEnabled: true
        ServiceName: com.amazonaws.ap-southeast-2.ecr.dkr
        SubnetIds:
          - { Ref: frontendSubnet1 }
          - { Ref: frontendSubnet2 }
        VpcEndpointType: Interface
        VpcId: { Ref: frontendVpc }

Any ideas as to what is causing the error?

asked Nov 04 '19 06:11 by tschumann



2 Answers

Did you also add an S3 endpoint? Here is a working snippet from my template; I was able to solve the issue with the help of AWS Support:

  EcrDkrEndpoint:
    Type: 'AWS::EC2::VPCEndpoint'
    Properties:
      PrivateDnsEnabled: true
      SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.ecr.dkr'
      SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
      VpcEndpointType: Interface
      VpcId: !Ref 'VPC'
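
The snippet above covers only the ecr.dkr endpoint. ECR also exposes a separate API endpoint (ecr.api), and tasks on Fargate platform version 1.4.0 or later need both in order to pull images. A minimal sketch of the second endpoint, assuming the same security group and subnet names as above (the resource name EcrApiEndpoint is chosen here for illustration and is not from the original template):

```yaml
  # Hypothetical companion to EcrDkrEndpoint: the ECR API interface endpoint.
  EcrApiEndpoint:
    Type: 'AWS::EC2::VPCEndpoint'
    Properties:
      PrivateDnsEnabled: true
      SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.ecr.api'
      SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
      VpcEndpointType: Interface
      VpcId: !Ref 'VPC'
```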

For S3, note that a gateway endpoint has to be attached to a route table. Normally you would use the same one that holds the 0.0.0.0/0 route to the internet gateway:

  S3Endpoint:
    Type: 'AWS::EC2::VPCEndpoint'
    Properties:
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
      VpcEndpointType: Gateway
      VpcId: !Ref 'VPC'
      RouteTableIds: [!Ref 'PrivateRouteTable']

An endpoint for CloudWatch Logs is necessary too; without it you will get another failure:

  CloudWatchEndpoint:
    Type: 'AWS::EC2::VPCEndpoint'
    Properties:
      PrivateDnsEnabled: true
      SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.logs'
      SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
      VpcEndpointType: Interface
      VpcId: !Ref 'VPC'

EDIT: private route table:

  PrivateRoute:
    Type: AWS::EC2::Route
    DependsOn: InternetGatewayAttachement
    Properties:
      RouteTableId: !Ref 'PublicRouteTable'
      DestinationCidrBlock: '0.0.0.0/0'
      GatewayId: !Ref 'InternetGateway'
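
The S3 endpoint refers to a PrivateRouteTable that is not shown in these snippets. A gateway endpoint only affects subnets associated with the route tables listed in RouteTableIds, so a sketch of that table and its subnet associations might look like the following. It assumes the VPC and subnet names used earlier; the two association resource names are hypothetical.

```yaml
  # Route table that the S3 gateway endpoint adds its prefix-list route to.
  PrivateRouteTable:
    Type: 'AWS::EC2::RouteTable'
    Properties:
      VpcId: !Ref 'VPC'

  # Associate both task subnets with the table so their S3 traffic
  # follows the endpoint route. Resource names here are hypothetical.
  PrivateSubnetOneRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      RouteTableId: !Ref 'PrivateRouteTable'
      SubnetId: !Ref 'PrivateSubnetOne'

  PrivateSubnetTwoRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      RouteTableId: !Ref 'PrivateRouteTable'
      SubnetId: !Ref 'PrivateSubnetTwo'
```
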
answered Sep 19 '22 14:09 by graphik_


I found that I needed not only VPC endpoints for S3, CloudWatch Logs, and the two ECR endpoints, as detailed in @graphik_'s answer, but also security groups on the endpoints that allow HTTPS ingress from the security group on the Fargate containers.

The security group on the Fargate containers needs HTTPS egress to the VPC endpoint security group and also to the pl-7ba54012 prefix list, which is S3.

This, together with the route to pl-7ba54012 in the route table, seems to be the whole picture.
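
The rules described above could be sketched in CloudFormation roughly as follows. EndpointSecurityGroup and the three rule resource names are hypothetical. FargateContainerSecurityGroup is borrowed from the other answer, which attaches that group to the endpoints directly, whereas this sketch assumes a separate group on the endpoints. The prefix list ID is the one quoted above; prefix list IDs are region-specific, so it will differ in other regions.

```yaml
  # Endpoints accept HTTPS from the Fargate task security group.
  EndpointIngressFromFargate:
    Type: 'AWS::EC2::SecurityGroupIngress'
    Properties:
      GroupId: !Ref 'EndpointSecurityGroup'   # hypothetical group on the endpoints
      IpProtocol: tcp
      FromPort: 443
      ToPort: 443
      SourceSecurityGroupId: !Ref 'FargateContainerSecurityGroup'

  # Tasks may send HTTPS to the interface endpoints.
  FargateEgressToEndpoints:
    Type: 'AWS::EC2::SecurityGroupEgress'
    Properties:
      GroupId: !Ref 'FargateContainerSecurityGroup'
      IpProtocol: tcp
      FromPort: 443
      ToPort: 443
      DestinationSecurityGroupId: !Ref 'EndpointSecurityGroup'

  # Tasks may send HTTPS to S3 via the gateway endpoint's prefix list.
  FargateEgressToS3:
    Type: 'AWS::EC2::SecurityGroupEgress'
    Properties:
      GroupId: !Ref 'FargateContainerSecurityGroup'
      IpProtocol: tcp
      FromPort: 443
      ToPort: 443
      DestinationPrefixListId: pl-7ba54012   # region-specific; taken from the text above
```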

There are also policies on the VPC endpoints, which I left as the default "All Access", but you could harden these to allow access only from the role running the Fargate containers.
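
As an untested sketch of what such hardening might look like, the endpoint resource accepts a PolicyDocument property. The example below assumes the role in question is the task execution role, here given the hypothetical name TaskExecutionRole, and reuses the endpoint definition from the other answer; whether this condition is sufficient for every call Fargate makes through the endpoint is an assumption worth verifying before relying on it.

```yaml
  EcrDkrEndpoint:
    Type: 'AWS::EC2::VPCEndpoint'
    Properties:
      PrivateDnsEnabled: true
      SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.ecr.dkr'
      SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
      VpcEndpointType: Interface
      VpcId: !Ref 'VPC'
      # Replaces the default "All Access" policy: only requests made by the
      # (hypothetical) TaskExecutionRole are allowed through this endpoint.
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal: '*'
            Action: '*'
            Resource: '*'
            Condition:
              ArnEquals:
                'aws:PrincipalArn': !GetAtt 'TaskExecutionRole.Arn'
```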

answered Sep 21 '22 14:09 by M. Day