AWS Athena partition fetch all paths

Recently, I've run into an issue with AWS Athena when there is a fairly high number of partitions.

The old setup had a database and tables with only one partition level, say id=x. Take one table as an example: it stores payment parameters per id (product), and there aren't that many IDs, roughly 1000-5000. When querying that table with the id in the where clause, like ".. where id = 10", the queries returned pretty fast. Assume we update the data twice a day.

Lately, we've been thinking of adding a second partition level for the day, like "../id=x/dt=yyyy-mm-dd/..". This means the number of partitions grows by the number of IDs every day; with 3000 IDs we'd get roughly 3000x30=90000 new partitions per month. In other words, a rapid growth in the number of partitions.
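
For illustration, here is a minimal sketch of what such a two-level layout might look like as Athena DDL; the table name, columns, and bucket are hypothetical, since the actual schema isn't shown in this question:

-- Hypothetical table with the two-level partition layout described above
CREATE EXTERNAL TABLE db.payments (
  amount   double,
  currency string
)
PARTITIONED BY (id int, dt string)
STORED AS PARQUET
LOCATION 's3://example-bucket/payments/';

-- Every (id, dt) pair becomes its own partition entry and S3 prefix,
-- e.g. s3://example-bucket/payments/id=10/dt=2019-12-01/
ALTER TABLE db.payments ADD IF NOT EXISTS
  PARTITION (id = 10, dt = '2019-12-01')
  LOCATION 's3://example-bucket/payments/id=10/dt=2019-12-01/';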

On, say, 3 months of data (~270k partitions), we'd like a query like the following to return in at most 20 seconds or so.

select count(*) from db.table where id = x and dt = 'yyyy-mm-dd'

This takes like a minute.

The Real Case

It turns out Athena first fetches all partitions (metadata) and their S3 paths, regardless of the where clause, and only then filters down to the S3 paths that match the where condition. The first part (fetching all S3 paths by partition) takes time proportional to the number of partitions.

The more partitions you have, the slower the query executes.

Intuitively, I expected Athena to fetch only the S3 paths matching the where clause; that would be the whole point of the partitioning. Maybe it really does fetch all paths.
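
As a rough sanity check of how much partition metadata is involved, you can enumerate the registered partitions (db.table being the placeholder name from the query above):

-- Lists every partition registered for the table; with ~270k partitions
-- the enumeration alone can already take a noticeable amount of time
SHOW PARTITIONS db.table;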

  • Does anybody know a workaround, or are we using Athena the wrong way?
  • Should Athena be used only with a small number of partitions?

Edit

To clarify the statement above, here is an excerpt from the support mail.

from Support

... You mentioned that your new system has 360000 which is a huge number. So when you are doing select * from <partitioned table>, Athena first download all partition metadata and searched S3 path mapped with those partitions. This process of fetching data for each partition lead to longer time in query execution. ...

Update

I have opened an issue on the AWS forums; the linked issue is here.

Thanks.



1 Answer

This is impossible to properly answer without knowing the amount of data, what file formats, and how many files we're talking about.

TL;DR: I suspect you have partitions with thousands of files and that the bottleneck is listing and reading them all.

For any data set that grows over time you should have temporal partitioning, on date or even time, depending on query patterns. Whether you should also partition on other properties depends on a lot of factors, and in the end it often turns out that not partitioning is better. Not always, but often.
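
As a concrete (and purely hypothetical) contrast to the id=x/dt=... layout from the question, a date-only scheme would keep id as a regular column:

-- Temporal-only partitioning: one partition per day, id stays a regular column
CREATE EXTERNAL TABLE db.payments_by_day (
  id       int,
  amount   double,
  currency string
)
PARTITIONED BY (dt string)
STORED AS PARQUET
LOCATION 's3://example-bucket/payments-by-day/';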

Using reasonably sized (~100 MB) Parquet files can in many cases be more effective than partitioning. The reason is that partitioning increases the number of prefixes that have to be listed on S3, and the number of files that have to be read. A single 100 MB Parquet file can be more efficient than ten 10 MB files in many cases.
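
One way to end up with fewer, larger Parquet files is a CTAS statement that rewrites the data. This is only a sketch, reusing the hypothetical column names and S3 location from the earlier sketch rather than anything stated in the question:

-- Rewrite the data as Parquet, bucketing by id to cap the number of
-- output files per dt partition (all names and the location are assumed)
CREATE TABLE db.table_compacted
WITH (
  format = 'PARQUET',
  external_location = 's3://example-bucket/payments-compacted/',
  partitioned_by = ARRAY['dt'],
  bucketed_by = ARRAY['id'],
  bucket_count = 10
) AS
SELECT amount, currency, id, dt
FROM db.table;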

When Athena executes a query it will first load partitions from Glue. Glue supports limited filtering on partitions, and will help a bit in pruning the list of partitions – so to the best of my knowledge it's not true that Athena reads all partition metadata.

When it has the partitions it will issue LIST operations to the partition locations to gather the files that are involved in the query – in other words, Athena won't list every partition location, just the ones in partitions selected for the query. This may still be a large number, and these list operations are definitely a bottleneck. It becomes especially bad if there are more than 1000 files in a partition, because that's the page size of S3's list operations, and multiple requests will have to be made sequentially.

With all files listed Athena will generate a list of splits, which may or may not equal the list of files – some file formats are splittable, and if files are big enough they are split and processed in parallel.

Only after all of that work is done does the actual query processing start. Depending on the total number of splits and the amount of available capacity in the Athena cluster, your query will be allocated resources and start executing.

If your data was in Parquet format, and there was one or a few files per partition, the count query in your question should run in a second or less. Parquet has enough metadata in the files that a count query doesn't have to read the data, just the file footer. It's hard to get any query to run in less than a second due to the multiple steps involved, but a query hitting a single partition should run quickly.

Since it takes two minutes I suspect you have hundreds of files per partition, if not thousands, and your bottleneck is that it takes too much time to run all the list and get operations in S3.
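
To check whether that is the case, one option is to count the distinct S3 objects behind a single partition using Athena's "$path" pseudo-column (the id and dt values below are placeholders):

-- Counts how many distinct S3 objects back one partition; hundreds or
-- thousands of small files here would point to the listing bottleneck
SELECT count(DISTINCT "$path") AS file_count
FROM db.table
WHERE id = 10
  AND dt = '2019-12-01';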

Answered by Theo