
Can S3 Select search multiple objects?

I'm testing out S3 Select and as far as I understand from the examples, you can treat a single object (CSV or JSON) as a data store.

I wanted to have a single JSON document per S3 object and search the entire bucket as a 'database'. I'm saving each 'file' as <ID>.json, and every file holds a JSON document with the same schema.

Is it possible to search multiple objects in a single call? For example, find all JSON documents where customerId = 123?

Nic Cottrell asked Jun 06 '18 13:06



1 Answer

Amazon S3 Select operates on only one object per request: each call names a single bucket and key.
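If you want to stay with S3 Select, one workaround is to list the objects yourself and send one request per key, then merge the results on the client. Below is a minimal sketch of that idea, assuming boto3 and one JSON document per object. The function name `select_across_objects` and the bucket and prefix in the usage comment are made up for illustration. Keep in mind this is one request per object, so it can get slow and costly on a large bucket.

```python
import json


def select_across_objects(s3, bucket, prefix, expression):
    """Run one S3 Select request per object under `prefix` and merge the results.

    `s3` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns a list of (key, record) tuples.
    """
    matches = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            response = s3.select_object_content(
                Bucket=bucket,
                Key=obj["Key"],
                Expression=expression,
                ExpressionType="SQL",
                # Assumption: each object holds a single JSON document.
                InputSerialization={"JSON": {"Type": "DOCUMENT"}},
                OutputSerialization={"JSON": {}},
            )
            # A record can be split across several events, so join the bytes first.
            raw = b"".join(
                event["Records"]["Payload"]
                for event in response["Payload"]
                if "Records" in event
            )
            for line in raw.decode("utf-8").splitlines():
                if line.strip():
                    matches.append((obj["Key"], json.loads(line)))
    return matches


# Hypothetical usage (bucket and prefix are placeholders):
#   import boto3
#   rows = select_across_objects(
#       boto3.client("s3"), "my-bucket", "customers/",
#       "SELECT * FROM S3Object s WHERE s.customerId = 123",
#   )
```

The client is passed in rather than created inside the function, so the same code can be exercised against a stub without touching AWS.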

You can use Amazon Athena to run queries across an S3 path (prefix), which will include all files under that path. It also supports partitioning.
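For the Athena route, here is a rough sketch of running the customerId lookup from Python with boto3. It assumes you have already defined an Athena table over the S3 path that holds the JSON files. The table name `customers_json`, the database name, and the results location in the usage comment are all placeholders, and `run_athena_query` is just an illustrative helper, not part of any library.

```python
import time


def run_athena_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query, wait for it to finish, and return the result rows.

    `athena` is expected to be a boto3 Athena client, e.g. boto3.client("athena").
    Each row is a list of strings; for SELECT queries the first row is the header.
    """
    start = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = start["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # Collect every page of results.
    rows = []
    kwargs = {"QueryExecutionId": query_id}
    while True:
        page = athena.get_query_results(**kwargs)
        for row in page["ResultSet"]["Rows"]:
            rows.append([col.get("VarCharValue") for col in row["Data"]])
        if "NextToken" not in page:
            break
        kwargs["NextToken"] = page["NextToken"]
    return rows


# Hypothetical usage (table, database and output location are placeholders):
#   import boto3
#   rows = run_athena_query(
#       boto3.client("athena"),
#       "SELECT * FROM customers_json WHERE customerId = 123",
#       database="my_database",
#       output_location="s3://my-bucket/athena-results/",
#   )
```

Unlike the per-object approach, this is a single query over everything under the table's location, which is closer to the 'bucket as a database' idea in the question.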

John Rotenstein answered Sep 20 '22 12:09