
filter a glob-like regex pattern in boto3

Tags: glob, boto3

Can I use boto3's filter tool for finding keys (technically sub-keys) in a bucket akin to files in a directory using glob?

I want to get a list of keys with a pattern like this "key/**/<pattern>/**.gz".

leon yin asked Feb 16 '16 20:02


2 Answers

Unfortunately not. S3 provides no server-side support for filtering of results (other than by prefix and delimiter).
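Since only the prefix is applied on the server, the usual workaround is to list by the literal part of the pattern and match the rest on the client. The following is a minimal sketch of that idea, not an official boto3 feature: `literal_prefix` and `glob_keys` are hypothetical helper names, and `bucket` is assumed to be a boto3 Bucket resource.

```python
import fnmatch


def literal_prefix(pattern):
    """Return the part of a glob pattern before its first wildcard character."""
    for i, ch in enumerate(pattern):
        if ch in "*?[":
            return pattern[:i]
    return pattern


def glob_keys(bucket, pattern):
    """Yield keys in `bucket` that match `pattern`, filtering on the client.

    `bucket` is assumed to be a boto3 Bucket resource; any object exposing
    objects.filter(Prefix=...) that yields items with a .key attribute works.
    """
    prefix = literal_prefix(pattern)
    # S3 narrows the listing by prefix on the server; the rest happens locally.
    for obj in bucket.objects.filter(Prefix=prefix):
        # fnmatchcase is case-sensitive, and its "*" also matches "/",
        # so "**" behaves the same as a single "*" here.
        if fnmatch.fnmatchcase(obj.key, pattern):
            yield obj.key


# Example usage (assumes AWS credentials and a bucket named "mybucketname"):
#   import boto3
#   bucket = boto3.resource("s3").Bucket("mybucketname")
#   for key in glob_keys(bucket, "key/**/pattern/**.gz"):
#       print(key)
```

Note that every key under the literal prefix is still listed and transferred, so this saves code rather than requests. Also, unlike a shell glob, `fnmatch` requires at least one path segment wherever `/**/` appears, so a key such as `key/pattern/x.gz` would not match `key/**/pattern/**.gz`.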

garnaat answered Sep 19 '22 12:09


You can use the exrex library to generate every string that a regex matches and pass each one to boto3 as a prefix. This is only practical when the regex expands to a small, finite set of strings. Here is a simple example, but you can imagine something a bit more complicated:

import exrex
import boto3

session = boto3.Session()  # optionally: boto3.Session(profile_name='xyz')
s3 = session.resource('s3')
bucket = s3.Bucket('mybucketname')

# Expand the regex into the concrete key prefixes it matches.
prefixes = list(exrex.generate(r'api/v2/responses/2016-11-08/(2016-11-08T2[2-3]|2016-11-09)'))

# List the objects under each prefix and collect them.
objects = []
for prefix in prefixes:
    print(prefix, end=" ")
    current_objects = list(bucket.objects.filter(Prefix=prefix))
    print(len(current_objects))
    objects.extend(current_objects)

This gives output:

api/v2/responses/2016-11-08/2016-11-08T22 1056
api/v2/responses/2016-11-08/2016-11-08T23 1056
api/v2/responses/2016-11-08/2016-11-09 24677
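The three prefixes above do not overlap, so no object is listed twice. With a looser regex, one generated prefix can be the start of another, and the loop would then collect the same objects more than once. A minimal sketch of one way to guard against that, assuming the prefixes are plain strings (`drop_covered_prefixes` is a hypothetical helper, not part of exrex or boto3):

```python
def drop_covered_prefixes(prefixes):
    """Remove any prefix that is already covered by a shorter one in the list."""
    kept = []
    # After sorting, every string that starts with a given prefix follows it
    # contiguously, so comparing against the last kept prefix is enough.
    for prefix in sorted(prefixes):
        if kept and prefix.startswith(kept[-1]):
            continue  # already covered (this also drops exact duplicates)
        kept.append(prefix)
    return kept
```

Running the generated list through this helper before the loop keeps only the broadest prefixes, so each object is listed at most once.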
Al Johri answered Sep 19 '22 12:09