I use UTF-8 in my editor, so all strings shown here are UTF-8 encoded in the file.
I have a Python script like this:
# -*- coding: utf-8 -*-
...
parser = optparse.OptionParser(
    description=_('automates the dice rolling in the classic game "risk"'),
    usage=_("usage: %prog attacking defending"))
Then I ran xgettext to extract the messages and got a .pot file, which boils down to:
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"
#: auto_dice.py:16
msgid "automates the dice rolling in the classic game \"risk\""
msgstr ""
After that, I used msginit to get a de.po, which I filled in like this:
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
#: auto_dice.py:16
msgid "automates the dice rolling in the classic game \"risk\""
msgstr "automatisiert das Würfeln bei \"Risiko\""
Running the script, I get the following error:
  File "/usr/lib/python2.6/optparse.py", line 1664, in print_help
    file.write(self.format_help().encode(encoding, "replace"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 60: ordinal not in range(128)
How can I fix that?
UTF-8 is a byte oriented encoding. The encoding specifies that each character is represented by a specific sequence of one or more bytes.
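That error means encode was called on a byte string. Python 2 first tries to decode it to Unicode using the system default encoding (ASCII), then re-encodes it with the encoding you specified. The implicit decode is the step that fails: 0xc3 is the first byte of the UTF-8 encoding of "ü", and it is not valid ASCII.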
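The implicit step can be made visible by spelling it out. This is a minimal sketch that also runs on Python 3; `implicit_encode` is a hypothetical helper that imitates what Python 2 does behind the scenes, not something optparse or gettext provides:

```python
# The UTF-8 bytes of the translated msgstr, as a Python 2 str would hold them.
raw = u'automatisiert das W\u00fcrfeln bei "Risiko"'.encode("utf-8")


def implicit_encode(bytestring, encoding):
    """Imitate Python 2's str.encode(): decode as ASCII first, then encode.

    Hypothetical helper for illustration only.
    """
    return bytestring.decode("ascii").encode(encoding, "replace")


try:
    implicit_encode(raw, "utf-8")
    error = None
except UnicodeDecodeError as exc:
    # The decode step fails before the requested encoding is ever used.
    error = exc

print(error)
```

Note that the "replace" error handler does not help here, because it only applies to the encode step, which is never reached.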
Generally, the way to resolve it is to call s.decode('utf-8') (or whatever encoding the strings are in) before trying to use the strings. It might also work if you just use unicode literals: u'automates...' (that depends on how strings are substituted from .po files, which I don't know about).
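As a rough sketch of the decode-first approach: `to_text` below is a hypothetical helper, and the byte string stands in for what the UTF-8 de.po catalog would return. In Python 2's gettext module, `ugettext` (or `gettext.install(..., unicode=True)`) returns unicode objects directly and would probably be the more natural fix, but that needs a compiled catalog and is not shown here.

```python
def to_text(value, encoding="utf-8"):
    """Return text, decoding only when value is a byte string.

    Hypothetical helper; the default encoding is an assumption that
    matches the charset declared in the de.po header.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Bytes as they would come out of the UTF-8 catalog (assumed for illustration).
raw_msgstr = b'automatisiert das W\xc3\xbcrfeln bei "Risiko"'

description = to_text(raw_msgstr)

# Encoding text works for any target; unrepresentable characters become "?".
as_utf8 = description.encode("utf-8", "replace")
as_ascii = description.encode("ascii", "replace")

print(as_utf8)
print(as_ascii)
```

Once the description passed to OptionParser is text rather than bytes, the encode call in print_help no longer needs the implicit ASCII decode.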
This sort of confusing behaviour is improved in Python 3, which won't try to convert bytes to unicode unless you specifically tell it to.