Can I use boto3's filter to find keys (technically sub-keys) in a bucket, the way glob finds files in a directory?
I want to get a list of keys matching a pattern like "key/**/<pattern>/**.gz".
Unfortunately not. S3 provides no server-side support for filtering of results (other than by prefix and delimiter).
As a workaround, you can use the exrex library to generate every string that a regex matches and pass each one to boto3 as a prefix. This works best when the regex expands to a small set of prefixes. Here is a simple example, but you can imagine something a bit more complicated:
import exrex
import boto3
session = boto3.Session() # profile_name='xyz'
s3 = session.resource('s3')
bucket = s3.Bucket('mybucketname')
prefixes = list(exrex.generate(r'api/v2/responses/2016-11-08/(2016-11-08T2[2-3]|2016-11-09)'))
objects = []
for prefix in prefixes:
    print(prefix, end=" ")
    current_objects = list(bucket.objects.filter(Prefix=prefix))
    print(len(current_objects))
    objects += current_objects
This gives output:
api/v2/responses/2016-11-08/2016-11-08T22 1056
api/v2/responses/2016-11-08/2016-11-08T23 1056
api/v2/responses/2016-11-08/2016-11-09 24677
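If the pattern has wildcards in the middle of the key, as in the question, another option is to list everything under the longest fixed prefix and filter the keys client-side. The following is only a sketch under some assumptions: filter_keys and list_matching_keys are hypothetical helper names, the bucket name and pattern in the usage note are placeholders, and the matching uses Python's fnmatch, whose "*" also matches "/" (so it behaves roughly like a recursive "**", not like a single path segment).

```python
import fnmatch


def filter_keys(keys, pattern):
    """Return the keys that match a shell-style pattern (case-sensitive).

    fnmatch's '*' matches any characters, including '/', so a pattern
    such as 'key/*/logs/*.gz' matches at any nesting depth.
    """
    return [key for key in keys if fnmatch.fnmatchcase(key, pattern)]


def list_matching_keys(bucket_name, prefix, pattern):
    """List keys under `prefix` and keep those matching `pattern`.

    Hypothetical helper: S3 narrows the listing by prefix only; the
    pattern match happens locally after the keys are downloaded.
    """
    import boto3  # imported here so filter_keys can be used without boto3

    bucket = boto3.resource("s3").Bucket(bucket_name)
    keys = (obj.key for obj in bucket.objects.filter(Prefix=prefix))
    return filter_keys(keys, pattern)
```

For example, list_matching_keys("mybucketname", "key/", "key/*/logs/*.gz") would list every object under "key/" and keep only the matching ones. Because the whole prefix is listed, this can be slow on large buckets, so it helps to make the prefix as specific as possible.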