
Get count of objects in a specific S3 folder using Boto3

I'm trying to get the count of objects in an S3 folder.

Current code

import boto3

bucket = 'some-bucket'
File = 'someLocation/File/'

# a single list_objects_v2 call returns at most 1000 keys
objs = boto3.client('s3').list_objects_v2(Bucket=bucket, Prefix=File)
fileCount = objs['KeyCount']

This gives me a count that is one more than the actual number of objects in the folder.

Maybe it is counting the "File" folder itself as a key too?

ThatComputerGuy asked Feb 12 '19 18:02




2 Answers

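If there are more than 1000 entries, a single list call is truncated, so you need to use a paginator. Note that with Delimiter='/' the code below counts the immediate subfolders (CommonPrefixes) under the prefix, not the objects themselves:

import boto3

count = 0
client = boto3.client('s3')
paginator = client.get_paginator('list_objects')
for result in paginator.paginate(Bucket='your-bucket', Prefix='your-folder/', Delimiter='/'):
    # 'CommonPrefixes' is absent from pages without subfolders, so default to an empty list
    count += len(result.get('CommonPrefixes', []))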
matt burns answered Sep 17 '22 19:09
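To count the objects themselves rather than the subfolders, one option is to page through list_objects_v2 and skip the key that equals the prefix. This is only a sketch: it assumes the extra "+1" in the question comes from a zero-byte folder marker (such as the one the S3 console writes when you create a folder), and the helper names count_objects_in_pages and count_objects are made up for illustration.

```python
def count_objects_in_pages(pages, prefix):
    """Count keys across list_objects_v2 pages, skipping the folder marker."""
    count = 0
    for page in pages:
        # 'Contents' is absent from a page that holds no objects
        for obj in page.get("Contents", []):
            # A key equal to the prefix itself (e.g. 'someLocation/File/')
            # is assumed to be the zero-byte "folder" marker, not a real file
            if obj["Key"] != prefix:
                count += 1
    return count


def count_objects(bucket, prefix):
    """Hypothetical wrapper; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the counting logic can be tried offline

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix)
    return count_objects_in_pages(pages, prefix)
```

With the names from the question, the call would be count_objects('some-bucket', 'someLocation/File/'). Without a Delimiter, objects in nested subfolders are counted as well.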


This assumes you want to count every key in a bucket without hitting the 1000-key limit of a single list_objects_v2 call. The code below worked for me, but I wonder whether there is a better, faster way to do it. I looked for a ready-made function in the boto3 S3 API but could not find one.

import boto3

# connect to s3 - assuming your creds are all set up and you have boto3 installed
s3 = boto3.resource('s3')

# list all bucket names to find the one you need
for bucket in s3.buckets.all():
    print(bucket.name)

# get the bucket
bucket = s3.Bucket('my-s3-bucket')

# loop over every object and increment the count;
# the objects collection pages through results automatically
count_obj = 0
for i in bucket.objects.all():
    count_obj += 1
print(count_obj)
vagabond answered Sep 16 '22 19:09
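The resource-based loop above counts the whole bucket. To restrict it to one folder, as the question asks, the collection can be narrowed with bucket.objects.filter(Prefix=...). A minimal sketch, assuming that keys ending in '/' are folder markers you do not want to count; count_keys and count_in_folder are hypothetical helper names.

```python
def count_keys(object_summaries, skip_folder_markers=True):
    """Count items that expose a .key attribute, e.g. S3 ObjectSummary objects."""
    count = 0
    for obj in object_summaries:
        # Keys ending in '/' are treated as folder markers (an assumption)
        if skip_folder_markers and obj.key.endswith("/"):
            continue
        count += 1
    return count


def count_in_folder(bucket_name, prefix):
    """Hypothetical wrapper; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so count_keys can be tried offline

    bucket = boto3.resource("s3").Bucket(bucket_name)
    # filter(Prefix=...) limits the listing to keys under the prefix
    return count_keys(bucket.objects.filter(Prefix=prefix))
```

For example, count_in_folder('my-s3-bucket', 'someLocation/File/') would count only the keys under that prefix, including those in nested subfolders.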