How to put an object to S3 with Content-MD5

Question

I am trying to upload an XML file to S3 using boto3. As recommended by Amazon, I would like to send a Base64-encoded 128-bit MD5 digest (Content-MD5) of the data.

https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Object.put

My Code:

with open(file, 'rb') as tempfile:
    body = tempfile.read()

hash_object = hashlib.md5(body)
base64_md5 = base64.encodebytes(hash_object.digest())

response = s3.Object(self.bucket, self.key + file).put(
            Body=body.decode(self.encoding),
            ACL='private',
            Metadata=metadata,
            ContentType=self.content_type,
            ContentEncoding=self.encoding,
            ContentMD5=str(base64_md5)
        )

When I try this, str(base64_md5) creates a string like 'b'ZpL06Osuws3qFQJ8ktdBOw== ''

In this case, I get this error message:

An error occurred (InvalidDigest) when calling the PutObject operation: The Content-MD5 you specified was invalid.

For test purposes, I copied only the value without the 'b' in front: 'ZpL06Osuws3qFQJ8ktdBOw== '

Then I get this error message:

botocore.exceptions.HTTPClientError: An HTTP Client raised an unhandled exception: Invalid header value b'hvUe19qHj7rMbwOWVPEv6Q== '

Can anyone help me upload a file to S3 with a valid Content-MD5?

Thanks,

Oliver

tedder42 · Accepted Answer

Starting with @Isaac Fife's example, I stripped it down to what is actually required and added the imports to make it a fully reproducible example.

(The only change you need to make is to use your own bucket name.)

import base64
import hashlib
import boto3

contents = "hello world!"
md = hashlib.md5(contents.encode('utf-8')).digest()
contents_md5 = base64.b64encode(md).decode('utf-8')

boto3.client('s3').put_object(
  Bucket="mybucket",
  Key="test",
  Body=contents,
  ContentMD5=contents_md5
)

Lessons learned: the MD5 value you need to send will NOT look like what an upload returns. The ContentMD5 parameter needs the base64 encoding of the raw digest, whereas an upload returns the md.hexdigest() form. Hex is base16, which is not base64.
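To make the hex-versus-base64 point concrete, here is a minimal sketch that encodes the same MD5 digest both ways. It only uses the standard library and makes no AWS call; the variable names hex_form and b64_form are just for illustration.

```python
import base64
import hashlib

contents = "hello world!"
digest = hashlib.md5(contents.encode("utf-8")).digest()  # 16 raw bytes

# Same digest, two different text encodings
hex_form = digest.hex()                              # 32 hex characters, like hexdigest()
b64_form = base64.b64encode(digest).decode("ascii")  # 24 characters, the form ContentMD5 expects

print(hex_form)
print(b64_form)
```

Both strings describe the same 16 bytes, so decoding either one gives back the identical digest; only the base64 one is accepted as a Content-MD5 value.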

Isaac Fife · Answer

(Python 3.7)

Took me hours to figure this out because the only error you get is "The Content-MD5 you specified was invalid." Super useful for debugging... Anyway, here is the code I used to actually get the file to upload correctly before refactoring.

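import boto3
from botocore.config import Config

# json_converter and md5 are project modules (get_content_md5 is shown below);
# result, bucket and key_id are defined elsewhere.
json_results = json_converter.convert_to_json(result)
json_results_utf8 = json_results.encode('utf-8')      # str -> bytes
content_md5 = md5.get_content_md5(json_results_utf8)  # base64-encoded digest, as bytes
content_md5_string = content_md5.decode('utf-8')      # bytes -> str for ContentMD5
metadata = {
    "md5chksum": content_md5_string
}
s3 = boto3.resource('s3', config=Config(signature_version='s3v4'))
obj = s3.Object(bucket, 'filename.json')
obj.put(
    Body=json_results_utf8,
    ContentMD5=content_md5_string,
    ServerSideEncryption='aws:kms',
    Metadata=metadata,
    SSEKMSKeyId=key_id)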

And the hashing helper:

import base64
import hashlib

def get_content_md5(data):
    # data must be bytes; returns the base64-encoded MD5 digest as bytes
    digest = hashlib.md5(data).digest()
    return base64.b64encode(digest)

The hard part for me was figuring out what encoding you need at each step in the process, since I was not very familiar with how strings are stored in Python at the time.

get_content_md5 takes a bytes-like object only (here, the UTF-8 encoded JSON) and returns bytes as well. But to pass the MD5 hash to AWS, it needs to be a str, so you have to decode it before you give it to ContentMD5.
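As a rough illustration of why the decode step matters, the sketch below compares the two calls used in the question (base64.encodebytes and str()) with b64encode plus decode. The sample payload is made up. As far as I can tell, the trailing newline and the b'...' wrapper are what cause the two errors in the question.

```python
import base64
import hashlib

digest = hashlib.md5(b"<xml/>").digest()

with_newline = base64.encodebytes(digest)  # bytes, with a trailing b"\n" appended
clean = base64.b64encode(digest)           # bytes, no trailing newline

as_repr = str(clean)             # the repr "b'...'", including the b prefix and quotes
as_text = clean.decode("ascii")  # plain str, usable as a header value

print(repr(with_newline))
print(as_repr)
print(as_text)
```

Only as_text is a clean base64 string; as_repr carries the extra b'...' wrapper, and with_newline would put a newline into the header.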

Pro tip: Body, on the other hand, needs to be given bytes or a seekable file-like object. If you pass a seekable object, make sure you seek(0) to the beginning of the file before you pass it to AWS, or the MD5 will not match. For that reason, using bytes is less error-prone, imo.
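If you do want to pass a file object, something like the following sketch keeps the hash and the stream position in step. content_md5_and_rewind is a hypothetical helper name, not part of boto3, and the io.BytesIO payload just stands in for a real file opened in binary mode.

```python
import base64
import hashlib
import io


def content_md5_and_rewind(fileobj, chunk_size=8192):
    """Hypothetical helper: hash a seekable binary file object, then rewind it."""
    md5 = hashlib.md5()
    fileobj.seek(0)  # hash from the first byte, wherever the cursor was
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        md5.update(chunk)
    fileobj.seek(0)  # rewind so a later upload also starts from the first byte
    return base64.b64encode(md5.digest()).decode("ascii")


payload = b'{"status": "ok"}'  # made-up sample data
stream = io.BytesIO(payload)
header_value = content_md5_and_rewind(stream)
# stream and header_value could now be passed as Body= and ContentMD5= to put()
```

Reading in chunks avoids loading a large file into memory just to hash it, at the cost of having to remember the rewind.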

Tags: python, amazon-web-services, amazon-s3, boto3

Meschkov
