How to retrieve bucket prefixes in a filesystem style using boto3

Doing something like the following:

s3 = boto3.resource('s3')
bucket = s3.Bucket('a_dummy_bucket')
bucket.objects.all()

This returns every object in the 'a_dummy_bucket' bucket, for example:

test1/blah/blah/afile45645.zip
test1/blah/blah/afile23411.zip
test1/blah/blah/afile23411.zip
[...] 2500 files
test2/blah/blah/afile.zip
[...] 2500 files
test3/blah/blah/afile.zip
[...] 2500 files

Is there any way to get just 'test1', 'test2', 'test3', etc. in this case, without paginating over every result? To discover that 'test2' exists I need 3 paginated calls of 1000 keys each, then another 3 calls of 1000 keys to reach 'test3', and so on.

How can I get all these prefixes without paginating over all results?

Thanks

Pepeluis asked May 02 '16 20:05


1 Answer

I believe Common Prefixes are what you are looking for. You can retrieve them like this:

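import boto3

client = boto3.client('s3')
paginator = client.get_paginator('list_objects')
# Delimiter='/' makes S3 roll keys up into top-level prefixes
result = paginator.paginate(Bucket='my-bucket', Delimiter='/')
for prefix in result.search('CommonPrefixes'):
    # search() yields None for a page that has no CommonPrefixes
    if prefix is not None:
        print(prefix.get('Prefix'))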
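To walk the bucket one level at a time, filesystem style, the same kind of call can also be given a Prefix. Here is a minimal sketch of that idea, assuming the list_objects_v2 paginator; the helper name list_subprefixes and its client argument are made up for illustration and are not part of boto3:

```python
def list_subprefixes(client, bucket, prefix=""):
    """Return the immediate "subdirectories" under prefix.

    client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    list_subprefixes is a hypothetical helper name, not a boto3 API.
    """
    paginator = client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/")
    found = []
    for page in pages:
        # CommonPrefixes is absent from a page that has no rolled-up keys.
        for entry in page.get("CommonPrefixes", []):
            found.append(entry["Prefix"])
    return found


# Usage (needs AWS credentials and a real bucket):
#   import boto3
#   s3 = boto3.client("s3")
#   list_subprefixes(s3, "a_dummy_bucket")            # top level
#   list_subprefixes(s3, "a_dummy_bucket", "test1/")  # one level down
```

Passing a prefix that ends in the delimiter (such as "test1/") should list only the next level under it, so each call stays small regardless of how many objects sit deeper in the tree.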

The AWS documentation for GET Bucket says the following regarding Common Prefixes:

A response can contain CommonPrefixes only if you specify a delimiter. When you do, CommonPrefixes contains all (if there are any) keys between Prefix and the next occurrence of the string specified by delimiter. In effect, CommonPrefixes lists keys that act like subdirectories in the directory specified by Prefix. For example, if prefix is notes/ and delimiter is a slash (/), in notes/summer/july, the common prefix is notes/summer/. All of the keys rolled up in a common prefix count as a single return when calculating the number of returns. See MaxKeys.

Type: String

Ancestor: ListBucketResult
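The roll-up described in that quote can be imitated locally to see what to expect. This is only a rough sketch of the grouping rule in plain Python, not how S3 implements it; the function name common_prefixes and the sample keys are assumptions for illustration:

```python
def common_prefixes(keys, prefix="", delimiter="/"):
    """Mimic, locally, how keys are grouped into common prefixes."""
    rolled_up = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            continue  # no delimiter after the prefix: a plain key, not a group
        # Keep everything up to and including the first delimiter after prefix.
        group = prefix + rest[:cut + len(delimiter)]
        if group not in rolled_up:
            rolled_up.append(group)
    return rolled_up


keys = [
    "test1/blah/blah/afile45645.zip",
    "test1/blah/blah/afile23411.zip",
    "test2/blah/blah/afile.zip",
    "test3/blah/blah/afile.zip",
    "notes/summer/july",
]
print(common_prefixes(keys))                   # ['notes/', 'test1/', 'test2/', 'test3/']
print(common_prefixes(keys, prefix="notes/"))  # ['notes/summer/']
```

However many keys share a prefix, they collapse into a single entry, which is why the listing no longer has to page through every object.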

Cory Shay answered Sep 29 '22 01:09