I am working with Amazon S3 uploads and am having trouble with key names being too long. S3 limits the length of the key by bytes, not characters.
From the docs:
The name for a key is a sequence of Unicode characters whose UTF-8 encoding is at most 1024 bytes long.
I also want to embed metadata in the file name, so I need to calculate the byte length of the resulting key in Python to make sure the metadata does not push the key over the limit (in which case I would have to fall back to a separate metadata file).
How can I determine the byte length of a UTF-8 encoded string? Again, I am not interested in the character length, but in the actual number of bytes used to store the string.
def utf8len(s):
return len(s.encode('utf-8'))
Works fine in Python 2 and 3.
Use the string's encode() method to convert the character string to a byte string, then apply len() as usual:
>>> s = u"¡Hola, mundo!"
>>> len(s)
13 # characters
>>> len(s.encode('utf-8'))
14 # bytes