
botocore s3 put has issue hashing file due to encoding?

I'm having trouble figuring out why a file whose contents are "DELETE ME LATER", opened with encoding utf-8, causes an exception in botocore when it is being hashed.

with io.open('deleteme', 'r', encoding='utf-8') as f:
    try:
        resp = client.put_object(
            Body=f,
            Bucket='s3-bucket-actual-name-for-real',
            Key='testing/a/put'
        )
        print('deleteme exists')
        print(resp)
    except:
        print('deleteme could not put')
        raise

Produces:

deleteme could not put
Traceback (most recent call last):
  File "./test_operator.py", line 41, in <module>
    Key='testing/a/put'
  File "/Users/lamblin/VEnvs/awscli/lib/python3.6/site-packages/botocore/client.py", line 312, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/lamblin/VEnvs/awscli/lib/python3.6/site-packages/botocore/client.py", line 582, in _make_api_call
    request_signer=self._request_signer, context=request_context)
  File "/Users/lamblin/VEnvs/awscli/lib/python3.6/site-packages/botocore/hooks.py", line 242, in emit_until_response
    responses = self._emit(event_name, kwargs, stop_on_response=True)
  File "/Users/lamblin/VEnvs/awscli/lib/python3.6/site-packages/botocore/hooks.py", line 210, in _emit
    response = handler(**kwargs)
  File "/Users/lamblin/VEnvs/awscli/lib/python3.6/site-packages/botocore/handlers.py", line 201, in conditionally_calculate_md5
    calculate_md5(params, **kwargs)
  File "/Users/lamblin/VEnvs/awscli/lib/python3.6/site-packages/botocore/handlers.py", line 179, in calculate_md5
    binary_md5 = _calculate_md5_from_file(body)
  File "/Users/lamblin/VEnvs/awscli/lib/python3.6/site-packages/botocore/handlers.py", line 193, in _calculate_md5_from_file
    md5.update(chunk)
TypeError: Unicode-objects must be encoded before hashing

This can be avoided by opening the file with 'rb', but isn't the file object f clearly using an encoding?

asked Nov 27 '17 20:11 by dlamblin


1 Answer

This can be avoided by opening the file with 'rb', but isn't the file object f clearly using an encoding?

The encoding passed to io.open in mode 'r' is used to decode the content. By the time botocore reads from f, Python has already converted the content from bytes to str (text). Hash functions such as MD5 accept only bytes, hence the TypeError.
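The difference can be reproduced without botocore. The following is a minimal sketch, assuming Python 3; the helper md5_of_first_chunk and the temporary file are made up for illustration and only mimic what a hashing step does with a chunk read from the file object.

```python
import hashlib
import os
import tempfile


def md5_of_first_chunk(path, mode, **open_kwargs):
    """Read one chunk from path and feed it to hashlib.md5 (hypothetical helper)."""
    with open(path, mode, **open_kwargs) as f:
        chunk = f.read(1024)
    digest = hashlib.md5()
    digest.update(chunk)  # raises TypeError when chunk is str rather than bytes
    return type(chunk).__name__, digest.hexdigest()


# Write the sample contents from the question to a temporary file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"DELETE ME LATER")

# Text mode: the chunk is already decoded to str, so hashing fails.
try:
    md5_of_first_chunk(path, "r", encoding="utf-8")
    text_mode_error = None
except TypeError as exc:
    text_mode_error = exc

# Binary mode: the chunk is bytes, so hashing succeeds.
chunk_type, hex_digest = md5_of_first_chunk(path, "rb")
os.remove(path)

print("text mode error:", text_mode_error)
print("binary mode chunk type:", chunk_type)
```

The exact wording of the TypeError message differs between Python versions, but the cause is the same: text mode hands str to the hash, binary mode hands bytes.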

To interface with botocore directly, open your file with mode 'rb' and drop the encoding kwarg. There is no point in decoding the content to text when the first thing botocore would have to do to transport it is encode it back into bytes.
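As a sketch of that change, the upload from the question could be wrapped like this. The function name put_file is hypothetical, and client is assumed to be the same S3 client object used in the question; only the open mode differs from the original code.

```python
def put_file(client, path, bucket, key):
    """Upload the file at path, passing a binary file object as Body.

    put_file is a hypothetical wrapper; client is assumed to expose
    put_object(Body=..., Bucket=..., Key=...) as in the question.
    """
    # 'rb' yields bytes, which the MD5 step shown in the traceback can hash.
    with open(path, "rb") as f:
        return client.put_object(Body=f, Bucket=bucket, Key=key)
```

With the names from the question, the call would be put_file(client, 'deleteme', 's3-bucket-actual-name-for-real', 'testing/a/put').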

answered Oct 02 '22 14:10 by wim