Retrieving the ETag of an S3 object using the boto3 client

Tags: amazon-s3, checksum

I have a scenario where I need to verify the checksum (MD5) of a file stored in an S3 bucket. This can be done at upload time by specifying the checksum value in the metadata of the API call. In my case, however, I want to verify the checksum programmatically after the data has been put into the bucket. Every object in S3 has an attribute called 'ETag' which, for a simple (non-multipart) upload, is normally the MD5 checksum calculated by S3.

Is there any way to get the ETag of a specific object and compare the checksum of the local file with that of the file stored in S3, using the boto3 client in a Python script?

621

asked Sep 19 '18 09:09

S.K. Venkat

2 Answers

The boto3 API provides a way to get the metadata of an object stored in S3. The following snippet retrieves it programmatically:

>>> import pprint
>>> import boto3
>>> s3_cli = boto3.client('s3')
>>> s3_resp = s3_cli.head_object(Bucket='ventests3', Key='config/ctl.json')
>>> pprint.pprint(s3_resp)
{'AcceptRanges': 'bytes',
 'ContentLength': 4325,
 'ContentType': 'binary/octet-stream',
 'ETag': '"040c003386f1e2001816d32f2125d07a"',
 'LastModified': datetime.datetime(2018, 9, 20, 7, 15, 3, tzinfo=tzutc()),
 'Metadata': {},
 'ResponseMetadata': {'HTTPHeaders': {'accept-ranges': 'bytes',
                                      'content-length': '4325',
                                      'content-type': 'binary/octet-stream',
                                      'date': 'Thu, 20 Sep 2018 07:20:53 GMT',
                                      'etag': '"040c003386f1e2001816d32f2125d07a"',
                                      'last-modified': 'Thu, 20 Sep 2018 07:15:03 GMT',
                                      'server': 'AmazonS3',
                                      'x-amz-id-2': 'P2wapOciWCKPfol2sBgoo11tRdr4KwKcDJ/nHW7LZn00mvKfMYyfAPPV2tIcf3Vu+lrV57NBARY=',
                                      'x-amz-request-id': '42AF970E7C9AA18C'},
                      'HTTPStatusCode': 200,
                      'HostId': 'P2wapOciWCKPfol2sBgoo11tRdr4KwKcDJ/nHW7LZn00mvKfMYyfAPPV2tIcf3Vu+lrV57NBARY=',
                      'RequestId': '42AF970E7C9AA18C',
                      'RetryAttempts': 0}}

>>> s3obj_etag = s3_resp['ETag'].strip('"')
>>> print(s3obj_etag)
040c003386f1e2001816d32f2125d07a

The head_object() method of the S3 client fetches the metadata (headers) of a given object stored in the S3 bucket, without downloading the object itself.
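
To cover the comparison the question asks about, here is a minimal sketch that hashes the local file and checks it against the ETag in a head_object() response. The helper names local_md5 and etag_matches_file are made up for illustration. It assumes the object was uploaded in a single part and without SSE-KMS or SSE-C encryption; otherwise the ETag is not a plain MD5 digest and this check does not apply.

```python
import hashlib


def local_md5(path, chunk_size=8 * 1024 * 1024):
    """Return the hex MD5 digest of a local file, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def etag_matches_file(head_response, path):
    """Compare the ETag from a head_object() response with a local file.

    Assumption: the ETag is a plain MD5 (single-part upload, no SSE-KMS/SSE-C).
    """
    etag = head_response["ETag"].strip('"')  # S3 wraps the ETag in double quotes
    if "-" in etag:
        # An ETag such as "<hex>-<n>" comes from a multipart upload and is
        # not the MD5 of the whole object, so refuse to compare.
        raise ValueError("multipart ETag cannot be compared to a plain MD5")
    return etag == local_md5(path)


# Hypothetical usage (bucket and key taken from the answer above):
#   import boto3
#   resp = boto3.client("s3").head_object(Bucket="ventests3", Key="config/ctl.json")
#   print(etag_matches_file(resp, "ctl.json"))
```

Reading the file in chunks keeps memory use flat for large files; the 8 MiB chunk size is an arbitrary choice.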

118

answered Jan 03 '23 19:01

S.K. Venkat

Do not use the ETag of an object in one bucket to determine equivalence with an object in another bucket (with the same key). In some experiments, I found that for large objects the ETag is not consistent from region to region. For example, a large file in a bucket in us-east-1 may have a different ETag after it is copied to us-east-2. Whether the ETag stays the same from bucket to bucket varies from object to object: I did see some large objects with the same ETag in both regions. I resorted to creating my own tags containing the md5sum, and when I copy an object from one bucket to another, I also copy the tags.
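
The tag-based approach described above might look roughly like the sketch below. This is not the answerer's actual code: the tag name "md5sum" and both helper names are assumptions made for illustration. The only boto3 calls used are put_object_tagging() and get_object_tagging() on an S3 client that the caller passes in.

```python
MD5_TAG_KEY = "md5sum"  # hypothetical tag name, pick whatever suits you


def tag_object_with_md5(s3_client, bucket, key, md5_hex):
    """Store a locally computed MD5 digest as an object tag.

    Note: put_object_tagging replaces the object's whole tag set, so any
    existing tags would have to be merged in by the caller first.
    """
    s3_client.put_object_tagging(
        Bucket=bucket,
        Key=key,
        Tagging={"TagSet": [{"Key": MD5_TAG_KEY, "Value": md5_hex}]},
    )


def read_md5_tag(s3_client, bucket, key):
    """Return the MD5 digest stored in the object's tags, or None if absent."""
    resp = s3_client.get_object_tagging(Bucket=bucket, Key=key)
    for tag in resp["TagSet"]:
        if tag["Key"] == MD5_TAG_KEY:
            return tag["Value"]
    return None


# Hypothetical usage:
#   import boto3
#   s3 = boto3.client("s3")
#   tag_object_with_md5(s3, "my-bucket", "big/file.bin", "040c003386f1e2001816d32f2125d07a")
#   print(read_md5_tag(s3, "my-bucket", "big/file.bin"))
```

Because the digest lives in a tag rather than in the ETag, it does not depend on how S3 split the object into parts during upload or copy.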

answered Jan 03 '23 17:01

Peter Van Sickel

