I have more than 500,000 objects on s3. I am trying to get the size of each object. I am using the following python code for that
import boto3 bucket = 'bucket' prefix = 'prefix' contents = boto3.client('s3').list_objects_v2(Bucket=bucket, MaxKeys=1000, Prefix=prefix)["Contents"] for c in contents: print(c["Size"])
But it just gave me the size of the top 1000 objects. Based on the documentation we can't get more than 1000. Is there any way I can get more than that?
Q: How much data can I store in Amazon S3? The total volume of data and number of objects you can store are unlimited. Individual Amazon S3 objects can range in size from a minimum of 0 bytes to a maximum of 5 TB. The largest object that can be uploaded in a single PUT is 5 GB.
PDF. Returns some or all (up to 1,000) of the objects in a bucket with each request. You can use the request parameters as selection criteria to return a subset of the objects in a bucket. A 200 OK response can contain valid or invalid XML.
Summary. Microsoft security update MS11-100 limits the maximum number of form keys, files, and JSON members to 1000 in an HTTP request. Because of this change, ASP.NET applications reject requests that have more than 1000 of these elements.
Individual Amazon S3 objects can now range in size from 1 byte all the way to 5 terabytes (TB). Now customers can store extremely large files as single objects, which greatly simplifies their storage experience.
The inbuilt boto3 Paginator
class is the easiest way to overcome the 1000 record limitation of list-objects-v2
. This can be implemented as follows
s3 = boto3.client('s3') paginator = s3.get_paginator('list_objects_v2') pages = paginator.paginate(Bucket='bucket', Prefix='prefix') for page in pages: for obj in page['Contents']: print(obj['Size'])
For more details: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Paginator.ListObjectsV2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With