I'm trying to hash some Unicode strings:
hashlib.sha1(s).hexdigest()
but I get:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-81: ordinal not in range(128)
where s is something like:
œ∑¡™£¢∞§¶•ªº–≠œ∑´®†¥¨ˆøπ“‘åß∂ƒ©˙∆˚¬…æΩ≈ç√∫˜µ≤≥÷åйцукенгшщзхъфывапролджэячсмитьбююю..юбьтијџўќ†њѓѕ'‘“«««\dzћ÷…•∆љl«єђxcvіƒm≤≥ї!@#$©^&*(()––––––––––∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆•…÷ћzdzћ÷…•∆љlљ∆•…÷ћzћ÷…•∆љ∆•…љ∆•…љ∆•…∆љ•…∆љ•…љ∆•…∆•…∆•…∆•∆…•÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…
What should I fix?
To allow working with Unicode characters, Python 2 has a unicode type, which is a sequence of Unicode code points (like Python 3's str type). The line ustring = u'A unicode \u018e string \xf1' creates a Unicode string with 20 characters.
If encoding and/or errors are given, unicode() decodes the object, which can be either an 8-bit string or a character buffer, using the codec for encoding. The encoding parameter is a string giving the name of an encoding; if the encoding is not known, LookupError is raised.
To include Unicode characters in your Python source code, you can use Unicode escape sequences of the form \u0123 in your string. In Python 2.x, you also need to prefix the string literal with 'u'.
Unicode is a standard that assigns a number to characters from almost all languages. Every Unicode character is identified by a unique integer code point between 0 and 0x10FFFF. A Unicode string is a sequence of zero or more code points.
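A minimal sketch of the decoding behaviour described above. Since unicode() exists only in Python 2, this uses bytes.decode(), which takes the same encoding and errors arguments and runs on both Python 2 and 3. The sample byte strings and the codec name 'no-such-codec' are made up for illustration.

```python
# Valid UTF-8 bytes decode cleanly to a Unicode string.
good = b'caf\xc3\xa9'.decode('utf-8')

# b'\xe9' alone is not valid UTF-8; the errors argument decides what happens.
replaced = b'caf\xe9'.decode('utf-8', 'replace')  # bad byte becomes U+FFFD
ignored = b'caf\xe9'.decode('utf-8', 'ignore')    # bad byte is dropped

# An unknown encoding name raises LookupError.
try:
    b'abc'.decode('no-such-codec')  # hypothetical, deliberately invalid name
    unknown_codec_raised = False
except LookupError:
    unknown_codec_raised = True
```

With the default errors value ('strict'), the invalid byte would raise UnicodeDecodeError instead of being replaced or dropped.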
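As a rough illustration of code points versus bytes, the sketch below reuses the ustring literal from above. The variable names are only for this example; the u prefix is required in Python 2 and accepted in Python 3.3+.

```python
ustring = u'A unicode \u018e string \xf1'

# Each element of a Unicode string is one code point.
length = len(ustring)
code_points = [ord(ch) for ch in ustring]

# Encoding turns code points into bytes; non-ASCII characters
# take more than one byte each in UTF-8.
encoded = ustring.encode('utf-8')
encoded_length = len(encoded)

print(length, encoded_length)
```

The string has fewer characters than bytes because \u018e and \xf1 each occupy two bytes in UTF-8.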
Apparently hashlib.sha1 isn't expecting a unicode object, but rather a sequence of bytes in a str object. Encoding your unicode string to a sequence of bytes (using, say, the UTF-8 encoding) should fix it:
>>> import hashlib
>>> s = u'é'
>>> hashlib.sha1(s.encode('utf-8'))
<sha1 HASH object @ 029576A0>
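If you do this in several places, a small helper can keep the encoding step in one spot. This is only a sketch: sha1_hex is a hypothetical name, and UTF-8 is an assumed default. Any encoding works as long as you use the same one every time, since different encodings produce different bytes and therefore different hashes.

```python
import hashlib

def sha1_hex(text, encoding='utf-8'):
    """Return the SHA-1 hex digest of text (hypothetical helper)."""
    # Byte strings are hashed as-is; Unicode strings are encoded first.
    if isinstance(text, bytes):
        data = text
    else:
        data = text.encode(encoding)
    return hashlib.sha1(data).hexdigest()

digest = sha1_hex(u'\u0153\u2211\xa1\u2122')  # first few characters of s
print(digest)
```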
The error occurs because Python 2 tries to convert the unicode object to a str automatically, using the default ascii encoding, which can't represent the non-ASCII characters in your string.
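You can reproduce the same failure by doing the ASCII conversion explicitly, which is roughly what Python 2 attempts behind the scenes. The sample string here is just the first two characters of s, written as escapes. (In Python 3 the implicit conversion no longer happens; passing a str to hashlib.sha1 raises TypeError instead.)

```python
s = u'\u0153\u2211'  # the first two characters of the question's string

error = None
try:
    s.encode('ascii')  # ASCII only covers code points 0-127
except UnicodeEncodeError as exc:
    error = exc

print(error)
```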
A good starting point for learning more about Unicode and encodings is the Python docs, and this article by Joel Spolsky.