AWS Glue Access denied for crawler with administrator policy attached

I am trying to run a crawler over an S3 data store in my account that contains two CSV files. However, when I run the crawler, no tables are loaded, and I see the following errors in CloudWatch for each of the files:

  • Error Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;
  • Tables created did not infer schemas from this file.

This is especially odd as the IAM role has the AdministratorAccess policy attached, so there should not be any access denied issue.

Any help would be appreciated.

Jon Swanson asked Aug 17 '18 16:08




1 Answer

Check whether the files you are crawling are encrypted. If they are, your Glue role probably lacks a policy that allows it to decrypt them.
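One way to check for encryption is to read each object's metadata. The following is a minimal sketch, assuming boto3 is installed and that you run it with credentials that can list and read the bucket. The helper names `kms_key_for_object` and `find_kms_keys` are hypothetical and only for illustration; the bucket name and prefix you pass in are your own.

```python
def kms_key_for_object(head_response):
    """Return the KMS key ID/ARN if the object uses SSE-KMS, else None.

    `head_response` is the dict returned by S3 `head_object`.
    """
    sse = head_response.get("ServerSideEncryption", "")
    # "aws:kms" and "aws:kms:dsse" both rely on a KMS key;
    # "AES256" (SSE-S3) needs no KMS permission.
    if sse.startswith("aws:kms"):
        return head_response.get("SSEKMSKeyId")
    return None


def find_kms_keys(bucket, prefix=""):
    """Collect the KMS keys used by objects under a bucket/prefix."""
    import boto3  # imported lazily so the helper above works without boto3

    s3 = boto3.client("s3")
    keys = set()
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            head = s3.head_object(Bucket=bucket, Key=obj["Key"])
            key_id = kms_key_for_object(head)
            if key_id:
                keys.add(key_id)
    return keys
```

If `find_kms_keys` returns any key ARNs, those are the keys the crawler's role would need permission to use for decryption.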

If the files are encrypted with KMS keys, the role might need something like this:

{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": [
      "kms:Decrypt"
    ],
    "Resource": [
      "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
      "arn:aws:kms:us-west-2:111122223333:key/0987dcba-09fe-87dc-65ba-ab0987654321"
    ]
  }
}
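As a rough sketch, the same policy could be generated and attached to the role as an inline policy with boto3. The function names, the default policy name `GlueCrawlerKmsDecrypt`, and the role name you pass in are assumptions for illustration; the key ARNs should be replaced with the keys that actually encrypt your files.

```python
import json


def build_decrypt_policy(key_arns):
    """Build an IAM policy document allowing kms:Decrypt on the given keys."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": sorted(key_arns),
            }
        ],
    }


def attach_decrypt_policy(role_name, key_arns, policy_name="GlueCrawlerKmsDecrypt"):
    """Attach the policy inline to an existing role (hypothetical names)."""
    import boto3  # imported lazily; requires IAM write permissions

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(build_decrypt_policy(key_arns)),
    )
```

Note that an IAM policy alone may not be enough: the KMS key's own key policy also has to permit the role (or delegate access to IAM policies in the account), which can explain an access denied error even for a role with AdministratorAccess.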
Andy Zoutte answered Nov 12 '22 21:11