I'm trying to list objects in an Amazon S3 bucket in Python using boto3. It seems boto3 has 2 functions for listing the objects in a bucket: `list_objects()` and `list_objects_v2()`. What is the difference between the 2, and what is the benefit of using one over the other?
Side-by-side comparison:
`list_objects()`:

```python
response = client.list_objects(
    Bucket='string',
    Delimiter='string',
    EncodingType='url',
    # Marker to list the next page
    Marker='string',
    MaxKeys=123,
    Prefix='string'
)
```
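To make the pagination difference concrete, here is a minimal sketch (not from the original answer) of walking every page with `list_objects`. The `client` is assumed to be an S3 client from `boto3.client('s3')`; the fallback to the last key of the current page is an assumption for the case where `NextMarker` is absent (AWS only returns it when a `Delimiter` is set).

```python
def list_all_keys_v1(client, bucket, prefix=""):
    """Collect every key under `prefix` using Marker-based paging."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = client.list_objects(**kwargs)
        contents = response.get("Contents", [])
        keys.extend(obj["Key"] for obj in contents)
        if not response.get("IsTruncated"):
            break  # no more pages
        # NextMarker is only present when a Delimiter was given;
        # otherwise fall back to the last key returned in this page.
        kwargs["Marker"] = response.get("NextMarker", contents[-1]["Key"])
    return keys
```

Note that the caller has to derive the next `Marker` itself, which is exactly the bookkeeping the answer below calls a headache.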
`list_objects_v2()`:

```python
response = client.list_objects_v2(
    Bucket='string',
    Delimiter='string',
    EncodingType='url',
    MaxKeys=123,
    Prefix='string',
    # Replaces Marker for listing the next page
    ContinuationToken='string',
    # Set to True to fetch key owner info. Default is False.
    FetchOwner=True|False,
    # Similar to Marker in list_objects()
    StartAfter='string'
)
```
Added features. Because listing is limited to 1000 keys per page, using Marker to fetch multiple pages can be a headache: you have to keep track of the last key you successfully processed. With ContinuationToken, you don't need to know the last key; you just check for the presence of NextContinuationToken in the response. You can also spawn parallel processes to handle batches of 1000 keys without working out the last key needed to fetch the next page.
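The token-driven loop described above can be sketched as follows (an illustration, not part of the original answer); `client` is again assumed to be a `boto3.client('s3')` instance.

```python
def list_all_keys(client, bucket, prefix=""):
    """Collect every key under `prefix`, following continuation tokens."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = client.list_objects_v2(**kwargs)
        keys.extend(obj["Key"] for obj in response.get("Contents", []))
        token = response.get("NextContinuationToken")
        if token is None:
            break  # last page: no continuation token returned
        kwargs["ContinuationToken"] = token
    return keys
```

The loop never inspects the keys themselves to advance; it only passes the opaque token back, which is the main ergonomic win of `list_objects_v2()`.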