I am trying to execute query on Athena using python.
Sample code
client = boto3.client(
'athena',
region_name=region,
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)
execution = client.start_query_execution(
QueryString=query,
QueryExecutionContext={
'Database': database
},
WorkGroup=workgroup,
ResultConfiguration={
'OutputLocation': S3_OUTPUT_LOCATION
}
)
This is working code, But I got an unusual scenario.
InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: Unable to verify/create output bucket <BUCKET NAME>
We observed this scenario for a few days, Every 24 hours python script throws the error then we execute the query on the Athena console(Query editor) and re-run the python script. I don't understand why it is happening, is there any permission issue.
Permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"athena:GetWorkGroup",
"athena:StartQueryExecution",
"athena:ListDatabases",
"athena:StopQueryExecution",
"athena:GetQueryExecution",
"athena:GetQueryResults",
"athena:GetDatabase",
"athena:GetDataCatalog",
"athena:ListQueryExecutions",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::<BUCKET NAME>",
"arn:aws:s3:::<BUCKET NAME>/*",
]
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:ListBucket",
"athena:UpdateWorkGroup",
],
"Resource": [
"arn:aws:s3:::<BUCKET NAME>/*",
"arn:aws:s3:::<BUCKET NAME>",
"arn:aws:athena:*:<BUCKET NAME>/<PATH>",
]
},
{
"Sid": "VisualEditor2",
"Effect": "Allow",
"Action": [
"athena:ListDataCatalogs",
"s3:ListAllMyBuckets"
],
"Resource": "*"
}
]
}
When I run Amazon Athena queries in SQL Workbench/J, in AWS Lambda, or with an AWS SDK, I get the error: "Unable to verify/create output bucket." Here are some common causes of this error: The Amazon Simple Storage Service (Amazon S3) bucket that you specified for the query result location doesn't exist.
The Amazon Simple Storage Service (Amazon S3) bucket that you specified for the query result location doesn't exist. The AWS Identity and Access Management (IAM) policy for the user or role that runs the query doesn't have the required Amazon S3 permissions, such as s3:GetBucketLocation.
This error can occur when you query an Amazon S3 bucket prefix that has a large number of objects. For more information, see How do I resolve the "HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split s3://awsdoc-example-bucket/: Slow Down" error in Athena? in the AWS Knowledge Center.
This error usually occurs when a file is removed when a query is running. Either rerun the query, or check your workflow to see if another job or process is modifying the files when the query is running. This error can occur when you query an Amazon S3 bucket prefix that has a large number of objects.
I also faced same error today and found that execution role requires s3:GetBucketLocation permission also, AWS doc: https://aws.amazon.com/premiumsupport/knowledge-center/athena-output-bucket-error/
I was experiencing the same issue - random failures. The issue turned out to be s3:GetBucketLocation
policy being configured wrong. It was bundled with the same cluster as other s3 actions where the resource points to the s3 bucket, including path. It does not work this way.
I fixed it as below, works now.
- Effect: Allow
Action:
- s3:GetBucketLocation
Resource:
- arn:aws:s3:::*
- Effect: Allow
Action:
- s3:PutObject
- s3:GetObject
Resource:
- arn:aws:s3:::<BUCKET NAME>/<PATH>/*
See documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-with-s3-actions.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With