I am starting an ECS task on Fargate, and the container ends up in a STOPPED state after being in PENDING for a few minutes. The status shows the following error message:
CannotPullContainerError: context canceled
I am using PrivateLink to allow the ECS host to talk to the ECR registry without going via the public internet, and this is how it is configured (Serverless syntax augmenting CloudFormation):
Properties:
  PrivateDnsEnabled: true
  ServiceName: com.amazonaws.ap-southeast-2.ecr.dkr
  SubnetIds:
    - { Ref: frontendSubnet1 }
    - { Ref: frontendSubnet2 }
  VpcEndpointType: Interface
  VpcId: { Ref: frontendVpc }
Any ideas as to what is causing the error?
AWS ECS CannotPullContainerError occurs when a task fails to pull an image, typically because of an incorrectly configured network or an intermittent connection.
The "cannotpullcontainererror" error can cause tasks not to start. To start an Amazon ECS task on Fargate, your Amazon Virtual Private Cloud (Amazon VPC) networking configurations must allow your Amazon ECS infrastructure to access the repository where the image is stored.
If you receive an error similar to the following when launching a task, it's because a route to the internet doesn't exist: To resolve this issue, you can: For tasks in public subnets, specify ENABLED for Auto-assign public IP when launching the task. For more information, see Run a standalone task .
The common cause for this error is because the VPC your task is using doesn't have a route to pull the container image from Amazon ECR. When you specify an Amazon ECR image in your container definition, you must use the full URI of your ECR repository along with the image name in that repository.
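For illustration, a Fargate task definition with a full ECR image URI might look like the following sketch (the account ID, repository name, and tag are placeholders, and the resource names TaskDefinition and TaskExecutionRole are my own):

TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: frontend
    RequiresCompatibilities: [FARGATE]
    NetworkMode: awsvpc
    Cpu: '256'
    Memory: '512'
    # Assumed execution role; it is what actually pulls the image from ECR
    ExecutionRoleArn: !GetAtt 'TaskExecutionRole.Arn'
    ContainerDefinitions:
      - Name: frontend
        # Full URI form: <account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
        Image: '123456789012.dkr.ecr.ap-southeast-2.amazonaws.com/frontend:latest'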
Did you also add an S3 endpoint? Here is a working snippet from my template; I was able to solve the issue with help from AWS Support:
EcrDkrEndpoint:
  Type: 'AWS::EC2::VPCEndpoint'
  Properties:
    PrivateDnsEnabled: true
    SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.ecr.dkr'
    SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
    VpcEndpointType: Interface
    VpcId: !Ref 'VPC'
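Note that Fargate tasks on platform version 1.4.0 and later also require the ecr.api interface endpoint alongside ecr.dkr; a companion resource following the same pattern (the resource name EcrApiEndpoint is my own):

EcrApiEndpoint:
  Type: 'AWS::EC2::VPCEndpoint'
  Properties:
    PrivateDnsEnabled: true
    SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.ecr.api'
    SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
    VpcEndpointType: Interface
    VpcId: !Ref 'VPC'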
For S3 you need to know that a Gateway endpoint attaches to a route table rather than to subnets; normally you would use the same route table as for the internet gateway, the one containing the 0.0.0.0/0 route:
S3Endpoint:
  Type: 'AWS::EC2::VPCEndpoint'
  Properties:
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
    VpcEndpointType: Gateway
    VpcId: !Ref 'VPC'
    RouteTableIds: [!Ref 'PrivateRouteTable']
Without an endpoint for CloudWatch Logs you will get another failure, so it is necessary too:
CloudWatchEndpoint:
  Type: 'AWS::EC2::VPCEndpoint'
  Properties:
    PrivateDnsEnabled: true
    SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.logs'
    SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
    VpcEndpointType: Interface
    VpcId: !Ref 'VPC'
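The Logs endpoint matters because a typical Fargate container definition ships its logs via the awslogs driver; a minimal sketch of that configuration (the log group name is illustrative):

LogConfiguration:
  LogDriver: awslogs
  Options:
    awslogs-group: '/ecs/frontend'
    awslogs-region: !Ref 'AWS::Region'
    awslogs-stream-prefix: 'ecs'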
EDIT: private route table:
PrivateRoute:
  Type: AWS::EC2::Route
  DependsOn: InternetGatewayAttachement
  Properties:
    RouteTableId: !Ref 'PublicRouteTable'
    DestinationCidrBlock: '0.0.0.0/0'
    GatewayId: !Ref 'InternetGateway'
I found I needed not only the VPC endpoints for S3, CloudWatch Logs, and the two ECR services as detailed in @graphik_'s answer, but also to ensure that the security groups on the endpoints allow HTTPS ingress from the security group on the Fargate containers.
The security group on the Fargate containers needs HTTPS egress to the VPC endpoint security group and also to the pl-7ba54012 prefix list, which is S3.
This, plus the route to pl-7ba54012 in the route table, seems to be the whole picture; a sketch of those rules follows.
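A hedged sketch of those security group rules in the same template style (EndpointSecurityGroup is an assumed name for the security group attached to the interface endpoints; pl-7ba54012 is the S3 prefix list for ap-southeast-2):

EndpointIngressFromContainers:
  Type: AWS::EC2::SecurityGroupIngress
  Properties:
    # Allow the Fargate containers to reach the interface endpoints over HTTPS
    GroupId: !Ref 'EndpointSecurityGroup'
    IpProtocol: tcp
    FromPort: 443
    ToPort: 443
    SourceSecurityGroupId: !Ref 'FargateContainerSecurityGroup'
ContainerEgressToEndpoints:
  Type: AWS::EC2::SecurityGroupEgress
  Properties:
    # Allow the containers outbound HTTPS to the endpoint security group
    GroupId: !Ref 'FargateContainerSecurityGroup'
    IpProtocol: tcp
    FromPort: 443
    ToPort: 443
    DestinationSecurityGroupId: !Ref 'EndpointSecurityGroup'
ContainerEgressToS3:
  Type: AWS::EC2::SecurityGroupEgress
  Properties:
    # Allow the containers outbound HTTPS to the S3 prefix list
    GroupId: !Ref 'FargateContainerSecurityGroup'
    IpProtocol: tcp
    FromPort: 443
    ToPort: 443
    DestinationPrefixListId: 'pl-7ba54012'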
There are policies on the VPC endpoints too, which I left as the default full access, but you could harden them to allow access only from the role running the Fargate containers.
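A sketch of such a hardened endpoint policy, added under an endpoint's Properties (TaskExecutionRole is an assumed name for the role your Fargate tasks run under):

PolicyDocument:
  Version: '2012-10-17'
  Statement:
    - Effect: Allow
      # Assumed role; restrict endpoint use to this principal only
      Principal:
        AWS: !GetAtt 'TaskExecutionRole.Arn'
      Action: '*'
      Resource: '*'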