I would like to use the AWS CLI to query the contents of a bucket and see if a particular file exists, but the bucket contains thousands of files. How can I filter the results to only show key names that match a pattern? For example:
aws s3api list-objects --bucket myBucketName --query "Contents[?Key==*mySearchPattern*]"
The --query argument uses JMESPath expressions. JMESPath has a built-in function, contains, that lets you test whether a string contains a given substring.
This should give the desired results:
aws s3api list-objects --bucket myBucketName --query "Contents[?contains(Key, `mySearchPattern`)]"
(On Linux I needed to use single quotes ' rather than back ticks ` around mySearchPattern.)
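If you run this search often from a Linux shell, the single-quoted form can be wrapped in a small function. This is only a sketch: the function name find_keys is made up for illustration, and the trailing .Key projection and --output text are my additions so that only the key names are printed rather than the full object records.

```shell
# Hypothetical helper (not part of the AWS CLI): print the keys in a bucket
# whose names contain a given substring.
# Usage: find_keys <bucket> <substring>
find_keys() {
  local bucket="$1" pattern="$2"
  # Single quotes inside the JMESPath expression mark a raw string literal.
  # Back ticks inside double quotes would be treated by the shell as
  # command substitution, which is why they fail on Linux.
  # Assumption: the substring itself contains no single quote.
  aws s3api list-objects \
    --bucket "$bucket" \
    --query "Contents[?contains(Key, '${pattern}')].Key" \
    --output text
}
```

For example, find_keys myBucketName mySearchPattern runs the same query as the command above, with the pattern passed in as an argument.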
If you want to search for keys starting with certain characters, you can also use the --prefix argument:
aws s3api list-objects --bucket myBucketName --prefix "myPrefixToSearchFor"
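The two approaches can be combined. --prefix is applied by S3 when it builds the listing, whereas --query filters the response on the client after it has been fetched, so on a large bucket it helps to narrow by prefix first when you know one. A rough sketch, where the function name find_keys_under_prefix and its argument order are my own choices for illustration:

```shell
# Hypothetical helper: list keys under a prefix, then keep only those whose
# names contain a substring.
# Usage: find_keys_under_prefix <bucket> <prefix> <substring>
find_keys_under_prefix() {
  local bucket="$1" prefix="$2" pattern="$3"
  # --prefix limits what S3 returns; --query then filters that smaller
  # listing client-side. Assumption: the substring has no single quote.
  aws s3api list-objects \
    --bucket "$bucket" \
    --prefix "$prefix" \
    --query "Contents[?contains(Key, '${pattern}')].Key" \
    --output text
}
```

For example, find_keys_under_prefix myBucketName myPrefixToSearchFor mySearchPattern lists only the keys that start with the prefix and also contain the pattern.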