I have a problem with Unicode in Python. I need to plot a graph with Unicode annotations in it. According to the tutorial, I should just create my string as Unicode. I do it like this:
annotation = u"%s has %s rev"%(art.title, len(art.revisions))
It is art.title that has Unicode characters in it. Sometimes that code works, and sometimes it gives me the error below:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
How can I fix it?
EDIT: The error is raised exactly on the "annotation" line:
File "script.py", line 195, in test_trie
annotation = u"%s has %s rev"%(art.title, len(art.revisions))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
To summarize the previous section: a Unicode string is a sequence of code points, which are numbers from 0 through 0x10FFFF (1,114,111 decimal). This sequence of code points needs to be represented in memory as a set of code units, and code units are then mapped to 8-bit bytes.
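As a small sketch of that pipeline, the snippet below uses an illustrative string chosen here (it is not from the question) to show the code points of a text and the bytes UTF-8 produces for them. Note that the non-ASCII character becomes two bytes, the first of which is 0xc3, the same byte named in the error message above.

```python
# Illustrative text; the escape \u00e9 is "é", written this way so the
# source file itself stays pure ASCII.
text = u"caf\u00e9"

# Code points: one integer per character.
code_points = [ord(ch) for ch in text]

# UTF-8 maps each code point to one or more 8-bit bytes.
utf8_bytes = text.encode("utf-8")

print(code_points)
print("%d characters, %d bytes" % (len(text), len(utf8_bytes)))
```

The text has four characters but five bytes, because the code point 233 needs two bytes in UTF-8.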
Python 2 supports the str type and the unicode type. A str is a sequence of bytes, while a unicode is a sequence of code points. The unicode object is an in-memory representation of the text, and each element in it is not a raw byte but a number that identifies a character in the Unicode character set.
Unicode is a character standard that is used to represent characters from almost all languages. Every Unicode character is assigned a unique integer code point between 0 and 0x10FFFF. A Unicode string is a sequence of zero or more code points.
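The error happens because art.title is a byte string (str), and Python 2 implicitly decodes it as ASCII when it is interpolated into a unicode template. You have two options: either use art.title.decode('utf_8'), or create a new Unicode string from the UTF-8 bytes with unicode(art.title, 'utf_8').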
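A minimal sketch of the decode-first approach, assuming the title really is UTF-8 (the 0xc3 byte in the error suggests it, but that is an assumption). The helper name to_text and the stand-in values for art.title and art.revisions are hypothetical, introduced only for illustration. The helper leaves values that are already Unicode untouched, which would explain why the original line works only some of the time.

```python
def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding byte strings explicitly."""
    # In Python 2, `bytes` is an alias of `str`, so this check covers both versions.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical stand-ins for art.title and art.revisions.
title = b"Caf\xc3\xa9"   # UTF-8 encoded byte string (assumed encoding)
revisions = [1, 2, 3]

# Every argument is Unicode before formatting, so no implicit ASCII decode happens.
annotation = u"%s has %s rev" % (to_text(title), len(revisions))
```

If the data comes from a source with a different encoding, the encoding argument would need to match it.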
I think it depends on whether your title has Unicode characters in it or not. I would try adding art.title.encode("utf-8") or art.title.decode("utf-8") and see how it works.
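For reference, the two calls go in opposite directions: decode turns bytes into Unicode text, and encode turns Unicode text into bytes. The sketch below uses an illustrative byte string (assumed to be UTF-8, not taken from the question) and also reproduces the ASCII failure from the traceback.

```python
raw = b"Caf\xc3\xa9"          # bytes as they might arrive from a file or database (assumed UTF-8)
text = raw.decode("utf-8")    # bytes -> Unicode text
back = text.encode("utf-8")   # Unicode text -> bytes

# Decoding the same bytes as ASCII fails in the same way as the question's traceback.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

print(ascii_error)
```

In Python 2, calling encode on a byte string first decodes it implicitly as ASCII, so art.title.encode("utf-8") would likely raise the same error if the title is a byte string containing non-ASCII bytes; decode is the call that matches this situation.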