
`EMR service role is invalid` when Creating EMR Cluster

After creating the Amazon S3 bucket my_bucket, I created an Elastic MapReduce (EMR) cluster via the CLI:

aws emr create-cluster --name "Hive testing" --ami-version 3.3 --applications Name=Hive --use-default-roles --instance-type m3.xlarge --instance-count 3 --steps Type=Hive,Name="Hive Program",Args=[-d,INPUT=s3://my_bucket/input,-d,OUTPUT=s3://my_bucket/input,-d,LIBS=s3://my_bucket/serde_libs]

Note that I did not specify a Hive *.q file. Once the S3 bucket and the EMR cluster exist, I plan to log onto the EMR box and run Hive interactively.

Note: I'm assuming there is an EMR box that I can log onto.

However, when I ran aws emr describe-cluster --cluster-id XYZ, I saw this error in the output:

    "State": "TERMINATED_WITH_ERRORS",
    "StateChangeReason": {
        "Message": "EMR service role arn:aws:iam::xyz:role/EMR_DefaultRole is invalid",
        "Code": "VALIDATION_ERROR"
    }

What would cause this error? Do I need to open permissions on the S3 bucket for the EMR cluster to access it?

Kevin Meredith asked Jan 14 '15 22:01



1 Answer

The issue is not with the bucket: the IAM role that the cluster expects is missing.

See http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-iam-roles-creatingroles.html#emr-iam-roles-createdefaultwithcli

Issue the AWS CLI command:

aws emr create-default-roles 

Then create the cluster again. This is a one-time step needed to create the default roles.
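
As a rough sketch of that one-time step, the function below checks for the default roles first and only calls aws emr create-default-roles when one is missing. The function name ensure_emr_default_roles and the AWS_CLI override variable are hypothetical names used for illustration; EMR_EC2_DefaultRole is assumed here to be the instance-profile role created alongside EMR_DefaultRole. A failing aws iam get-role is treated as "missing", although it could also mean the caller lacks IAM read permissions.

```shell
#!/bin/sh
# Sketch only: create the EMR default roles if they do not exist yet.
# AWS_CLI is a hypothetical override so the function can be tried
# against a stub instead of the real aws command.

ensure_emr_default_roles() {
    emr_cli="${AWS_CLI:-aws}"
    emr_missing=0

    # Assumed role names: the service role from the error message and
    # the EC2 instance-profile role created by create-default-roles.
    for emr_role in EMR_DefaultRole EMR_EC2_DefaultRole; do
        # get-role exits non-zero when the role cannot be read
        # (not found, or possibly insufficient permissions).
        if ! "$emr_cli" iam get-role --role-name "$emr_role" >/dev/null 2>&1; then
            echo "missing role: $emr_role"
            emr_missing=1
        fi
    done

    if [ "$emr_missing" -eq 1 ]; then
        # One-time step from the answer above.
        "$emr_cli" emr create-default-roles >/dev/null || return 1
        echo "created default roles"
    else
        echo "default roles already present"
    fi
}
```

After sourcing this, run ensure_emr_default_roles once and then re-issue the original aws emr create-cluster command with --use-default-roles.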

  • Note: make sure you are using a recent version of the AWS CLI. I had problems with 1.4 (the Debian jessie package).

  • Note 2: taken from the mrjob code and Amazon announcements:

    instance profile and service role are required for accounts created after April 6, 2015, and will eventually be required for all accounts

ChristopherB answered Nov 10 '22 22:11
