I have a Python 2.7 program that writes out data from various external applications. I continually get bitten by exceptions when I write to a file until I add .decode(errors="ignore") to the string being written out. (FWIW, opening the file as mode="wb" doesn't fix this.)
Is there a way to say "ignore encoding errors on all strings in this scope"?
As mentioned in my thread on the issue, the hack from Sven Marnach is even possible without defining a new function:
import codecs
codecs.register_error("strict", codecs.ignore_errors)
You cannot redefine methods on built-in types, and you cannot change the default value of the errors parameter to str.decode(). There are other ways to achieve the desired behaviour, though.
The slightly nicer way: Define your own decode() function:
def decode(s, encoding="ascii", errors="ignore"):
    return s.decode(encoding=encoding, errors=errors)
Now you will need to call decode(s) instead of s.decode(), but that's not too bad, is it?
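A quick usage sketch of the wrapper above. The sample byte string is made up for illustration (the UTF-8 bytes for "café"). With the default encoding="ascii", the two non-ASCII bytes are silently dropped; passing the correct codec keeps them.

```python
def decode(s, encoding="ascii", errors="ignore"):
    # Same wrapper as above: decode, dropping bytes the codec rejects.
    return s.decode(encoding=encoding, errors=errors)

# Hypothetical sample data: the UTF-8 encoding of u"caf\xe9".
raw = b"caf\xc3\xa9"

lossy = decode(raw)                    # non-ASCII bytes are dropped
exact = decode(raw, encoding="utf-8")  # correct codec, nothing is lost
```

Note that the lossy result gives no hint that data went missing, which is the main cost of this approach.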
The hack: You can't change the default value of the errors parameter, but you can overwrite what the handler for the default errors="strict" does:
import codecs
def strict_handler(exception):
    # Substitute nothing for the offending input and resume right after it.
    return u"", exception.end
codecs.register_error("strict", strict_handler)
This will essentially change the behaviour of errors="strict" to the standard "ignore" behaviour. Note that this will be a global change, affecting all modules you import.
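To see what a handler returning (u"", exception.end) does without touching the global "strict" entry, the same function can be registered under a separate name. The name "drop_bad_input" and the sample bytes are assumptions made for this sketch; they are not part of the codecs module.

```python
import codecs

def drop_handler(exception):
    # Replace the offending input with nothing and resume right after it.
    return u"", exception.end

# Hypothetical handler name; built-in names such as "strict" are left alone.
codecs.register_error("drop_bad_input", drop_handler)

raw = b"caf\xc3\xa9!"  # made-up sample: UTF-8 bytes followed by "!"
cleaned = raw.decode("ascii", "drop_bad_input")
default_ignore = raw.decode("ascii", "ignore")
```

Passing the handler name explicitly keeps the effect local to the calls that ask for it, at the cost of having to name it each time.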
I recommend neither of these two ways. The real solution is to get your encodings right. (I'm well aware that this isn't always possible.)
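One possible reading of "get your encodings right", as a rough sketch: decode incoming bytes once with the encoding the source really uses, keep unicode internally, and let the file object encode on the way out. The UTF-8 source encoding, the helper name write_text, and the temporary file are assumptions for illustration; io.open is available in Python 2.7.

```python
import io
import os
import tempfile

def write_text(path, raw_bytes, source_encoding="utf-8"):
    # Decode at the boundary using the (assumed) encoding of the source.
    text = raw_bytes.decode(source_encoding)
    # io.open encodes unicode on write, so no implicit ASCII step occurs.
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(text)

# Hypothetical sample: UTF-8 bytes received from an external application.
fd, path = tempfile.mkstemp()
os.close(fd)
write_text(path, b"caf\xc3\xa9\n")
with io.open(path, "r", encoding="utf-8") as handle:
    round_tripped = handle.read()
os.remove(path)
```

If the source encoding is wrong, the decode step raises immediately, which points at the real problem instead of hiding it.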