AWS Fargate ResourceInitializationError: unable to pull secrets or registry auth: pull command failed: : signal: killed

Slightly tearing my hair out with this one... I am trying to run a Docker image on Fargate in a public subnet of a non-default VPC. When I run it as a Task I get:

ResourceInitializationError: unable to pull secrets or registry auth: pull
command failed: : signal: killed

If I run the Task in a Private subnet, through a NAT, it works. It also works if I run it in a Public subnet of the default VPC.

I have checked through the advice here:

Aws ecs fargate ResourceInitializationError: unable to pull secrets or registry auth

In particular, I have the security groups set up to allow all traffic, and the Network ACL is also set up to allow all traffic. I have even been quite liberal with the IAM permissions, to try to eliminate that as a possibility:

The task execution role has:

{
  "Action": [
    "kms:*",
    "secretsmanager:*",
    "ssm:*",
    "s3:*",
    "ecr:*",
    "ecs:*",
    "ec2:*"
  ],
  "Resource": "*",
  "Effect": "Allow"
}

With trust relationship to allow ecs-tasks to assume this role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs-tasks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The security group is:

sg-093e79ca793d923ab All traffic All traffic All 0.0.0.0/0

And the Network ACL is:

Inbound
Rule number  Type         Protocol  Port range  Source       Allow/Deny
100          All traffic  All       All         0.0.0.0/0    Allow
*            All traffic  All       All         0.0.0.0/0    Deny

Outbound
Rule number  Type         Protocol  Port range  Destination  Allow/Deny
100          All traffic  All       All         0.0.0.0/0    Allow
*            All traffic  All       All         0.0.0.0/0    Deny

I set up flow logs on the subnet, and they show traffic as ACCEPT OK in both directions.

I do not have any Interface Endpoints set up to reach AWS services without going through the Internet Gateway.

A public IP address is also assigned to the Fargate task when it is created.

This should work, since the Public subnet should have access to all needed services through the Internet Gateway. It also works in the default VPC or a Private subnet.

Can anyone suggest what else I should check to debug this?

user2800708 asked Apr 28 '21 13:04


3 Answers

One potential cause of ResourceInitializationError: unable to pull secrets or registry auth: pull command failed: : signal: killed is that Auto-assign public IP is disabled. After I enabled it (recreating the service from scratch), the task ran properly without issues.
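For an existing service, one way to confirm this setting is to look at the service description. The sketch below is only an illustration: the helper name and the sample data are hypothetical, and the dictionary is assumed to have the shape of one entry in the `services` list returned by the ECS DescribeServices API.

```python
def public_ip_enabled(service: dict) -> bool:
    """Return True if the service description has assignPublicIp set to ENABLED."""
    vpc_config = service.get("networkConfiguration", {}).get("awsvpcConfiguration", {})
    return vpc_config.get("assignPublicIp") == "ENABLED"


# Hypothetical service description, trimmed to the fields that matter here.
sample_service = {
    "serviceName": "example-service",
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "subnets": ["subnet-aaaa1111"],
            "securityGroups": ["sg-bbbb2222"],
            "assignPublicIp": "DISABLED",
        }
    },
}

if not public_ip_enabled(sample_service):
    print("Auto-assign public IP is disabled for this service")
```

With boto3, the input would typically come from `ecs.describe_services(cluster=..., services=[...])["services"]`.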

valdem answered Nov 02 '22 19:11


For those unlucky souls, there is one more thing to check.

I already had an internet gateway in my VPC, DNS was enabled for that VPC, all containers were getting public IPs and the execution role already had access to ECR. But even so, I was still getting the same error.

It turned out the problem was the route table. The route table of my VPC didn't include a route directing outbound traffic to the internet gateway, so my subnet had no internet access.

Adding a second entry to the table that routes 0.0.0.0/0 traffic to the internet gateway solved the issue.
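This check can also be scripted. The following is a minimal sketch, assuming a route table dictionary in the shape returned by the EC2 DescribeRouteTables API; the function name and the sample tables are hypothetical.

```python
def has_default_route_to_igw(route_table: dict) -> bool:
    """Return True if the table has an active 0.0.0.0/0 route to an internet gateway."""
    for route in route_table.get("Routes", []):
        if (
            route.get("DestinationCidrBlock") == "0.0.0.0/0"
            and route.get("GatewayId", "").startswith("igw-")
            and route.get("State", "active") == "active"
        ):
            return True
    return False


# Hypothetical route table that only has the local route (the broken case).
local_only = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
    ]
}

# The same table after adding the default route to an internet gateway.
with_igw = {
    "Routes": local_only["Routes"]
    + [{"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0123456789abcdef0", "State": "active"}]
}

print(has_default_route_to_igw(local_only))
print(has_default_route_to_igw(with_igw))
```

With boto3, the table for a subnet could be fetched with `ec2.describe_route_tables` and the `association.subnet-id` filter. A subnet with no explicit association uses the VPC's main route table, so that table may need to be checked instead.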

e-mre answered Nov 02 '22 17:11


I was facing the same issue, but in my case I was starting the Fargate container from a Lambda function using the RunTask operation, and I was not passing the following parameter:

assignPublicIp: ENABLED

After adding it, the container started without any issues.
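As a rough sketch of where that parameter goes, the helper below builds the keyword arguments for a boto3 `run_task` call. The helper name and all identifiers in the example are hypothetical placeholders; only the parameter structure follows the RunTask API.

```python
def fargate_run_task_args(cluster, task_definition, subnets, security_groups):
    """Build RunTask arguments for a Fargate task that receives a public IP."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                # Without this, a task in a public subnet has no route out
                # to ECR or Secrets Manager through the internet gateway.
                "assignPublicIp": "ENABLED",
            }
        },
    }


# Placeholder identifiers, for illustration only.
args = fargate_run_task_args(
    cluster="example-cluster",
    task_definition="example-task:1",
    subnets=["subnet-aaaa1111"],
    security_groups=["sg-bbbb2222"],
)
print(args["networkConfiguration"]["awsvpcConfiguration"]["assignPublicIp"])
```

Inside the Lambda function, the result would then be passed as `boto3.client("ecs").run_task(**args)`.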

Gurudeepak answered Nov 02 '22 18:11