Does anyone know why the string conversion functions throw exceptions when errors="ignore" is passed? How can I convert from regular Python string objects to unicode without errors being thrown? Thanks very much!
python -c "import codecs; codecs.open('tmp', 'wb', encoding='utf8', errors='ignore').write('кошка')"
returns
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.6/codecs.py", line 686, in write
    return self.writer.write(data)
  File "/usr/lib/python2.6/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128)
EDIT -- thanks for the responses, but does anyone know how to convert the literal above without using the "u" prefix? The reason is that you could, of course, be dealing with something that isn't a constant :)
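The write method (in Python 2) takes a unicode object, and you're passing it a str, so the encode call at line 351 of codecs.py first tries to build a unicode object from it (with the default codec, 'ascii'). That implicit decode is where the exception comes from; errors='ignore' only applies to the UTF-8 encoding step, which is never reached. The fix is easy: change the write call to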
write(u'кошка')
The u prefix tells Python you're using a Unicode object, and it should be fine.
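For the non-constant case raised in the edit, one possible approach is to decode the byte string explicitly before writing it, instead of relying on the implicit ASCII decode. The sketch below assumes the incoming bytes are UTF-8 (substitute whatever encoding your data actually uses). The names raw, text and path are illustrative only, and io.open is swapped in for codecs.open here because it behaves the same way on Python 2.6+ and Python 3.

```python
import io
import os
import tempfile

# Bytes as they might arrive from a file, socket or argv.
# Assumption: they are UTF-8; these are the bytes of the word from the question.
raw = b'\xd0\xba\xd0\xbe\xd1\x88\xd0\xba\xd0\xb0'

# Decode explicitly. errors='ignore' drops undecodable bytes here,
# at the decode step, which is where the original exception was raised.
text = raw.decode('utf-8', 'ignore')

# Write the resulting unicode object; the file object handles the
# unicode -> UTF-8 encoding on the way out.
path = os.path.join(tempfile.mkdtemp(), 'tmp')
with io.open(path, 'w', encoding='utf8') as f:
    f.write(text)
```

In Python 2 the same idea applies to any str variable s: call s.decode(...) with the encoding the bytes are really in, and pass the result to write.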