
AWS Batch CLIENT_ERROR Invalid IamInstanceProfile

Originally posted this to ServerFault, but posting here in the hopes that someone might have run into my issue.

I'm trying to set up a container to run on AWS Batch. I'm not doing anything fancy, more or less just following the default set-up with everything. I'm getting an error that seems to be related to the instance role or the permissions associated with the instance role.

The set-up goes without a hitch at first. I set up my compute environment, then my queue, then I add a basic job to the queue. The job ends up getting stuck in the runnable state, and then after 20 minutes or so, my compute environment becomes "INVALID" with this error:

CLIENT_ERROR - Invalid IamInstanceProfile: arn:aws:iam::001234567890:role/ecsInstanceRole (Service: AmazonAutoScaling; Status Code: 400; Error Code: ValidationError; Request ID: blah)

I read this troubleshooting guide, which seems to tackle related problems (though they aren't exact matches). I've tried recreating the environment 5 or 6 times with no luck. I've also tried deleting my existing roles and letting the manager recreate them. Most of the problems in the troubleshooting guide seem to stem from roles that were set up incorrectly, either through the AWS CLI or through some console other than the Batch console. The guide even reads "the AWS Batch console only displays roles that have the correct trust relationship for compute environments". But I selected every role I've used via the console, which would seem to imply that they have the correct permissions.

Not sure what to do here, grateful for any help.

Alex Alifimoff asked Sep 17 '17 14:09


2 Answers

Somewhat confusingly, the instanceRole property of an AWS Batch compute environment must reference an IAM instance profile ARN rather than an IAM role ARN. That is, the instanceRole value should look like arn:aws:iam::123456789012:instance-profile/ecsInstanceRole rather than arn:aws:iam::123456789012:role/ecsInstanceRole. The error message does mention instance profiles, though.
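
To make the distinction concrete, here is a minimal Python sketch that checks which kind of IAM ARN a value is before it is used as instanceRole. The helper names classify_iam_arn and check_instance_role are hypothetical and not part of any AWS SDK, and the check only looks at the shape of the ARN string. It does not call AWS, so it cannot confirm that the instance profile actually exists.

```python
import re

# Matches IAM role and instance profile ARNs, e.g.
#   arn:aws:iam::123456789012:role/ecsInstanceRole
#   arn:aws:iam::123456789012:instance-profile/ecsInstanceRole
_IAM_ARN_RE = re.compile(
    r"^arn:(aws[a-z-]*):iam::(\d{12}):(role|instance-profile)/(.+)$"
)


def classify_iam_arn(arn: str) -> str:
    """Return 'role' or 'instance-profile' for an IAM ARN (hypothetical helper)."""
    match = _IAM_ARN_RE.match(arn)
    if match is None:
        raise ValueError(f"not an IAM role or instance profile ARN: {arn!r}")
    return match.group(3)


def check_instance_role(arn: str) -> None:
    """Raise ValueError unless the ARN has the instance profile shape."""
    kind = classify_iam_arn(arn)
    if kind != "instance-profile":
        raise ValueError(
            f"instanceRole expects an instance profile ARN, got a {kind} ARN: {arn}"
        )


if __name__ == "__main__":
    # The ARN from the error message in the question is a role ARN.
    print(classify_iam_arn("arn:aws:iam::001234567890:role/ecsInstanceRole"))
    # The form the compute environment expects.
    print(classify_iam_arn("arn:aws:iam::123456789012:instance-profile/ecsInstanceRole"))
```

Calling check_instance_role on the ARN from the question raises a ValueError, while the instance-profile form passes silently.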

The following CloudFormation snippet creates a valid Batch compute environment:

Parameters:
    VPC:
        Type: String
        Description: VPC ID of the target VPC
    Subnet:
        Type: List<AWS::EC2::Subnet::Id>
        Description: VPC subnet(s) for batch instances
    SG:
        Type: List<AWS::EC2::SecurityGroup::Id>
        Description: VPC Security group ID(s) for batch instances

Resources:
    MyBatchEnvironment:
        Type: "AWS::Batch::ComputeEnvironment"
        Properties:
            Type: MANAGED
            ServiceRole: !GetAtt MyBatchEnvironmentRole.Arn
            ComputeResources:
                MaxvCpus: 8
                SecurityGroupIds: !Ref SG
                Subnets: !Ref Subnet
                InstanceRole: !GetAtt MyBatchInstanceProfile.Arn
                MinvCpus: 0
                DesiredvCpus: 0
                Type: EC2
                InstanceTypes:
                    - optimal

    MyBatchEnvironmentRole:
        Type: "AWS::IAM::Role"
        Properties:
            AssumeRolePolicyDocument:
                Version: '2012-10-17'
                Statement:
                    - Effect: Allow
                      Principal: {Service: "batch.amazonaws.com"}
                      Action: "sts:AssumeRole"
            Path: /service-role/
            ManagedPolicyArns:
                - "arn:aws:iam::aws:policy/service-role/AWSBatchServiceRole"

    MyBatchInstanceRole:
        Type: "AWS::IAM::Role"
        Properties:
            AssumeRolePolicyDocument:
                Version: '2012-10-17'
                Statement:
                    - Effect: Allow
                      Principal: {Service: "ec2.amazonaws.com"}
                      Action: "sts:AssumeRole"
            Path: /
            ManagedPolicyArns:
                - "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"

    MyBatchInstanceProfile:
        Type: "AWS::IAM::InstanceProfile"
        Properties:
            Path: "/"
            Roles:
                - !Ref MyBatchInstanceRole

Alex Grigorovitch answered Oct 20 '22 16:10


Thank you for bringing this to our attention. We have resolved the root cause of this issue and the console should now work as expected. Please give this another try and let us know if you encounter any further errors.

Jamie from the AWS Batch team

Jamie Kinney answered Oct 20 '22 16:10