I am trying to run an image from a private repository on the aws-ecs-fargate-1.4.0 platform.
For private repository authentication, I followed the docs and it was working well.
Somehow, after updating the existing service many times, the task now fails to run and reports an error like:
ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to get registry auth from asm: service call has been retried 1 time(s): asm fetching secret from the service for <secretname>: RequestError: ...
I haven't changed the ecsTaskExecutionRole, and it contains all the policies required to fetch the secret value.
AWS employee here.
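What you are seeing is due to a change in how networking works between Fargate platform version 1.3.0 and Fargate platform version 1.4.0. As part of the change from using Docker to using containerd, we also made some changes to how networking works. In version 1.3.0 and below, each Fargate task got two network interfaces: a primary interface that carried your task's own traffic, and a secondary interface that the Fargate platform itself used, for example to fetch ECR authentication and task secrets.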
This secondary network interface had some downsides though. This secondary traffic did not show up in your VPC flow logs. Also while most traffic stayed in the customer VPC, the secondary network interface was sending traffic outside of your VPC. A number of customers complained that they did not have the ability to specify network level controls on this secondary network interface and what it was able to connect to.
To make the networking model less confusing and give customers more control, Fargate platform version 1.4.0 switched to a single network interface and keeps all traffic inside your VPC, even the Fargate platform traffic. The Fargate platform traffic for fetching ECR authentication and task secrets now uses the same task network interface as the rest of your task traffic. You can observe this traffic in VPC flow logs and control it using the routing table in your own AWS VPC.
However, with this increased ability to observe and control the Fargate platform networking, you also become responsible for ensuring that there is actually a network path configured in your VPC that allows the task to communicate with ECR and AWS Secrets Manager.
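A quick way to check whether such a network path exists is to attempt a TCP connection to the service endpoints from something running in the same subnet and security group as the task. The following is only a minimal sketch: the helper names `can_reach` and `platform_endpoints` are hypothetical, and the hostnames assume the standard regional endpoint pattern in a commercial AWS region.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def platform_endpoints(region: str) -> dict[str, str]:
    """Hostnames the Fargate platform traffic needs (assumed standard naming)."""
    return {
        "ecr_api": f"api.ecr.{region}.amazonaws.com",
        "secretsmanager": f"secretsmanager.{region}.amazonaws.com",
    }


if __name__ == "__main__":
    # "us-east-1" is a placeholder; use the region your task runs in.
    for name, host in platform_endpoints("us-east-1").items():
        status = "reachable" if can_reach(host) else "NOT reachable"
        print(f"{name}: {host} is {status}")
```

Running this from your workstation says nothing about the task's network path; it is only meaningful when executed from inside the same subnet, for example in a debug container or on an EC2 instance placed there.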
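There are a few ways to solve this: launch the task in a public subnet with a public IP address, so that it can reach these services through an internet gateway; launch it in a private subnet whose route table sends outbound traffic through a NAT gateway; or launch it in a private subnet and create VPC endpoints for the services the task needs (ECR, S3 for image layers, and Secrets Manager for secrets).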
You can read more about this change in this official blogpost, under the section "Task elastic network interface (ENI) now runs additional traffic flows"
https://aws.amazon.com/blogs/containers/aws-fargate-launches-platform-version-1-4/
I'm not completely sure about your setup but after I disabled the NAT-Gateways to save some $, I had a very similar error message on the aws-ecs-fargate-1.4.0 platform:
Stopped reason: ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 1 time(s): RequestError: send request failed caused by: Post https://api.ecr....
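The two stopped reasons quoted in this thread differ in which service the platform failed to reach: one mentions "asm" (AWS Secrets Manager), the other "ecr registry auth". As a rough sketch, a small helper can tell them apart. The function name `failing_service` is hypothetical, and the matching is based only on the message fragments shown here, not on any documented error format.

```python
def failing_service(stopped_reason: str) -> str:
    """Guess which AWS service a ResourceInitializationError failed to reach.

    Matches substrings seen in the messages quoted in this thread; other
    message variants will fall through to "unknown".
    """
    reason = stopped_reason.lower()
    if "registry auth from asm" in reason or "asm fetching secret" in reason:
        return "secretsmanager"
    if "ecr registry auth" in reason or "api.ecr" in reason:
        return "ecr"
    return "unknown"


if __name__ == "__main__":
    msg = (
        "ResourceInitializationError: unable to pull secrets or registry auth: "
        "execution resource retrieval failed: unable to retrieve ecr registry auth"
    )
    print(failing_service(msg))
```

Knowing which service is unreachable narrows down which route or VPC endpoint is missing.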
It turned out that I had to create VPC Endpoints to these Service names:
And I had to downgrade to the aws-ecs-fargate-1.3.0 platform. After the downgrade the Docker images could be pulled from ECR and the deployments succeeded again.
If you are using Secrets Manager without a NAT gateway, you may also have to create a VPC endpoint for com.amazonaws.REGION.secretsmanager.
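To make the REGION placeholder concrete, here is a small sketch that builds the endpoint service names for a given region. The function `endpoint_service_names` and its flags are hypothetical. The list reflects my understanding of what a Fargate 1.4.0 task in a private subnet without a NAT gateway commonly needs; it is an assumption, not the exact list from the answer above.

```python
def endpoint_service_names(
    region: str,
    uses_secrets_manager: bool = False,
    uses_cloudwatch_logs: bool = False,
) -> list[str]:
    """Build VPC endpoint service names for pulling from ECR on Fargate 1.4.0.

    Assumed baseline: ECR API (auth), ECR Docker registry, and S3 (image
    layers, usually a gateway endpoint). Optional services are added on demand.
    """
    names = [
        f"com.amazonaws.{region}.ecr.api",
        f"com.amazonaws.{region}.ecr.dkr",
        f"com.amazonaws.{region}.s3",
    ]
    if uses_secrets_manager:
        names.append(f"com.amazonaws.{region}.secretsmanager")
    if uses_cloudwatch_logs:
        names.append(f"com.amazonaws.{region}.logs")
    return names


if __name__ == "__main__":
    # "eu-central-1" is a placeholder region.
    for name in endpoint_service_names("eu-central-1", uses_secrets_manager=True):
        print(name)
```

Each name can then be supplied as the service name when creating the endpoint in the VPC console or through infrastructure-as-code tooling.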