Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

StartQueryExecution operation: Unable to verify/create output bucket

I am trying to execute query on Athena using python.

Sample code

   client = boto3.client(
        'athena', 
        region_name=region,
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY
    )
    execution = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': database
        },
        WorkGroup=workgroup,
        ResultConfiguration={
            'OutputLocation': S3_OUTPUT_LOCATION
        }
    )

This is working code, But I got an unusual scenario.

  • One day it throws an InvalidRequestException error Error
InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: Unable to verify/create output bucket <BUCKET NAME>
  • As per the DevOps application have all the permission, It should work.
  • We try to execute the same query on the AWS Athena console(Query editor). There it is working.
  • Then we re-run the python script, it is not throwing any error.
  • But on the next day, the python script start's throwing the same InvalidRequestException error.
  • Then we execute the same query on the AWS Athena console(Query editor) and re-run the python script, it started working.

We observed this scenario for a few days, Every 24 hours python script throws the error then we execute the query on the Athena console(Query editor) and re-run the python script. I don't understand why it is happening, is there any permission issue.

Permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "athena:GetWorkGroup",
                "athena:StartQueryExecution",
                "athena:ListDatabases",
                "athena:StopQueryExecution",
                "athena:GetQueryExecution",
                "athena:GetQueryResults",
                "athena:GetDatabase",
                "athena:GetDataCatalog",
                "athena:ListQueryExecutions",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET NAME>",
                "arn:aws:s3:::<BUCKET NAME>/*",
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket",
                "athena:UpdateWorkGroup",
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET NAME>/*",
                "arn:aws:s3:::<BUCKET NAME>",
                "arn:aws:athena:*:<BUCKET NAME>/<PATH>",
            ]
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "athena:ListDataCatalogs",
                "s3:ListAllMyBuckets"
            ],
            "Resource": "*"
        }
    ]
}
like image 890
NHD Avatar asked Feb 24 '21 10:02

NHD


People also ask

Why am I getting the error “unable to verify/create output bucket”?

When I run Amazon Athena queries in SQL Workbench/J, in AWS Lambda, or with an AWS SDK, I get the error: "Unable to verify/create output bucket." Here are some common causes of this error: The Amazon Simple Storage Service (Amazon S3) bucket that you specified for the query result location doesn't exist.

Why can't I find the S3 bucket I specified for query results?

The Amazon Simple Storage Service (Amazon S3) bucket that you specified for the query result location doesn't exist. The AWS Identity and Access Management (IAM) policy for the user or role that runs the query doesn't have the required Amazon S3 permissions, such as s3:GetBucketLocation.

Why is my S3 bucket not opening in hive?

This error can occur when you query an Amazon S3 bucket prefix that has a large number of objects. For more information, see How do I resolve the "HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split s3://awsdoc-example-bucket/: Slow Down" error in Athena? in the AWS Knowledge Center.

Why am I getting a file removed error when running a query?

This error usually occurs when a file is removed when a query is running. Either rerun the query, or check your workflow to see if another job or process is modifying the files when the query is running. This error can occur when you query an Amazon S3 bucket prefix that has a large number of objects.


2 Answers

I also faced same error today and found that execution role requires s3:GetBucketLocation permission also, AWS doc: https://aws.amazon.com/premiumsupport/knowledge-center/athena-output-bucket-error/

like image 142
Hansraj Das Avatar answered Oct 17 '22 13:10

Hansraj Das


I was experiencing the same issue - random failures. The issue turned out to be s3:GetBucketLocation policy being configured wrong. It was bundled with the same cluster as other s3 actions where the resource points to the s3 bucket, including path. It does not work this way.

I fixed it as below, works now.

- Effect: Allow
  Action:
    - s3:GetBucketLocation
  Resource:
    - arn:aws:s3:::*
- Effect: Allow
  Action:
    - s3:PutObject
    - s3:GetObject
  Resource:
    - arn:aws:s3:::<BUCKET NAME>/<PATH>/*

See documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-with-s3-actions.html

like image 22
Hannes R. Avatar answered Oct 17 '22 14:10

Hannes R.