
Get all S3 buckets given a prefix

Currently we have multiple buckets with an application prefix and a region suffix, e.g. bucket names:

  • myapp-us-east-1
  • myapp-us-west-1

Is there a way of finding all buckets given a certain prefix? Is there something like:

s3 = boto3.resource('s3')
buckets = s3.buckets.filter(Prefix="myapp-")
asked Mar 16 '16 17:03 by RAbraham



2 Answers

The high-level collection call s3.buckets.filter(Filters=somefilter) only works in the way documented for the Filters (list) parameter of describe_tags (http://boto3.readthedocs.org/en/latest/reference/services/ec2.html#EC2.Client). In that case, you MUST tag your buckets (s3.BucketTagging) before you can use the very specific filtering method s3.buckets.filter(Filters=formatted_tag_filter).

IMHO, tagging is a MUST if you plan to manage any resource inside AWS.
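A minimal sketch of tag-based selection done on the client side, in case it helps. It assumes each bucket's tags are read with get_bucket_tagging and compared locally, which costs one extra request per bucket. The helper names tags_match and buckets_with_tags, and the "app" tag key, are hypothetical.

```python
def tags_match(tag_set, wanted):
    """Return True if every key/value pair in `wanted` appears in an S3 TagSet list."""
    have = {tag["Key"]: tag["Value"] for tag in tag_set}
    return all(have.get(key) == value for key, value in wanted.items())


def buckets_with_tags(client, wanted):
    """Yield the names of buckets whose tags include all of `wanted`.

    `client` is assumed to behave like boto3.client("s3").
    """
    from botocore.exceptions import ClientError  # imported lazily, only when called

    for bucket in client.list_buckets()["Buckets"]:
        try:
            tag_set = client.get_bucket_tagging(Bucket=bucket["Name"])["TagSet"]
        except ClientError:
            # Untagged buckets raise an error here; this sketch treats every
            # ClientError as "no tags", which also hides permission problems.
            tag_set = []
        if tags_match(tag_set, wanted):
            yield bucket["Name"]
```

With a real client this could be called as list(buckets_with_tags(boto3.client("s3"), {"app": "myapp"})).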

Currently, you can do this:

import boto3

s3 = boto3.resource('s3')
# List every bucket, then keep only the names with the wanted prefix
for bucket in s3.buckets.all():
    if bucket.name.startswith("myapp-"):
        print(bucket.name)

The following example, adapted from the boto3 collections guide (http://boto3.readthedocs.org/en/latest/guide/collections.html), filters KEYS (not buckets) by prefix:

# S3: list all keys with the prefix 'photos/' in every "myapp-" bucket
import boto3

s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    if bucket.name.startswith("myapp-"):
        # Keys normally have no leading slash, so use 'photos/' rather than '/photos'
        for obj in bucket.objects.filter(Prefix='photos/'):
            print('{0}:{1}'.format(bucket.name, obj.key))

The guide attaches this warning to the example:

Warning: Behind the scenes, the above example will call ListBuckets, ListObjects, and HeadObject many times. If you have a large number of S3 objects then this could incur a significant cost.
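To make the number of requests easier to reason about, one option is to list keys with the low-level client and a paginator, so that each page is a single ListObjectsV2 call returning up to 1,000 keys. This is only a sketch: the helper name iter_keys is hypothetical, and the bucket name and the 'photos/' prefix are example values.

```python
def iter_keys(client, bucket_name, prefix):
    """Yield object keys under `prefix`, one ListObjectsV2 page at a time.

    `client` is assumed to behave like boto3.client("s3").
    """
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # "Contents" is missing from a page when no keys match
        for obj in page.get("Contents", []):
            yield obj["Key"]


def main():
    import boto3  # imported here so iter_keys can be used with any compatible client

    client = boto3.client("s3")
    for key in iter_keys(client, "myapp-us-east-1", "photos/"):
        print(key)
```

Calling main() needs AWS credentials and an existing bucket; iter_keys itself only depends on the client object it is given.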

answered Sep 24 '22 10:09 by mootmoot


When you retrieve a list of buckets from the S3 service, you're using the GET / operation on the S3 service.

Docs: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTServiceGET.html

This function does not take any request parameters, so there is no filtering done server-side.

If you want to filter based on your desired prefix, you'll need to retrieve the entire list of buckets, then filter it yourself.
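A minimal sketch of that approach with the low-level client, assuming a single list_buckets response contains all of the account's buckets. The helper names list_bucket_names and filter_by_prefix are hypothetical.

```python
def list_bucket_names(client):
    """Return the names of all buckets in the ListBuckets response.

    `client` is assumed to behave like boto3.client("s3").
    """
    return [bucket["Name"] for bucket in client.list_buckets()["Buckets"]]


def filter_by_prefix(names, prefix):
    """Keep only the names that start with `prefix` (done client-side)."""
    return [name for name in names if name.startswith(prefix)]


def main():
    import boto3  # imported here so the helpers work with any compatible client

    client = boto3.client("s3")
    for name in filter_by_prefix(list_bucket_names(client), "myapp-"):
        print(name)
```

Keeping the filter separate from the API call makes it easy to swap in another rule later, for example matching on the region suffix instead of the application prefix.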

answered Sep 25 '22 10:09 by Matt Houser