Decoding escaped unicode in Python 3 from a non-ascii string

Question

I have been searching for hours for a way to fully reverse the result of a str.encode call like this:

"testäch基er".encode("cp1252", "backslashreplace")

The result is

b'test\xe4ch\\u57faer'

Now I want to convert it back with

b'test\xe4ch\\u57faer'.decode("cp1252")

and I get

'testäch\\u57faer'

So how do I get my 基 back? I get close by using decode("unicode-escape") instead (it works for this example), but that treats the non-escaped bytes as ISO 8859-1 rather than cp1252, so any characters in the range 0x80 to 0x9F come out wrong.
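To make the 0x80 to 0x9F concern concrete, here is a minimal sketch. The sample string "€ä" is an assumption chosen for illustration: the euro sign is byte 0x80 in cp1252, while ISO 8859-1 maps that byte to a C1 control character.

```python
# "€" is byte 0x80 in cp1252; "ä" is 0xE4 in both cp1252 and ISO 8859-1.
original = "€ä"
encoded = original.encode("cp1252", "backslashreplace")

# unicode-escape reads non-escaped bytes as ISO 8859-1, so 0x80 becomes
# the control character U+0080 instead of the euro sign.
via_unicode_escape = encoded.decode("unicode-escape")

# Decoding with the codec that was used for encoding restores the text,
# but only because this sample contains no backslash escapes.
via_cp1252 = encoded.decode("cp1252")

print(repr(via_unicode_escape))
print(repr(via_cp1252))
```

Only characters whose cp1252 byte falls in that range are affected; "ä" survives either way.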

bobince · Accepted Answer

Well...

>>> b'test\xe4ch\\u57faer'.decode('unicode-escape')
'testäch基er'

But backslashreplace -> unicode-escape is not a consistent round trip. If the original string contains backslashes, backslashreplace will not escape them on encoding, but unicode-escape will still interpret them on decoding and replace them with unexpected characters.

>>> '☃ \\u2603'.encode('cp1252', 'backslashreplace').decode('unicode-escape')
'☃ ☃'

There is no way to reliably reverse the encoding of a string that has been encoded with an errors fallback such as backslashreplace. That's why it's a fallback: if you could consistently encode to it and decode from it, it would have been a real encoding.
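If a best-effort reversal is still wanted, one possible approach (a sketch, not something from the answer above) is to decode with cp1252 first and then expand only the escape forms that backslashreplace emits. The helper name unescape_after_decode and its regex are hypothetical, and the ambiguity described above remains: a literal backslash sequence in the original text is indistinguishable from an escape.

```python
import re

# The escape forms backslashreplace produces when encoding:
# \xNN, \uNNNN and \UNNNNNNNN (hypothetical helper pattern).
_ESCAPE = re.compile(
    r"\\(?:x([0-9a-fA-F]{2})|u([0-9a-fA-F]{4})|U([0-9a-fA-F]{8}))"
)


def unescape_after_decode(data: bytes, encoding: str = "cp1252") -> str:
    """Decode bytes with `encoding`, then expand backslash escapes."""
    text = data.decode(encoding)

    def _expand(match):
        code = int(match.group(1) or match.group(2) or match.group(3), 16)
        # Leave anything outside the Unicode range untouched.
        if code > 0x10FFFF:
            return match.group(0)
        return chr(code)

    return _ESCAPE.sub(_expand, text)


data = "testäch基er €".encode("cp1252", "backslashreplace")
print(unescape_after_decode(data))

# Still lossy: the literal text \u2603 in the input is expanded too.
lossy = "☃ \\u2603".encode("cp1252", "backslashreplace")
print(unescape_after_decode(lossy))
```

This handles the 0x80 to 0x9F bytes correctly because the byte decoding is done by cp1252, but the second call shows it is no more reversible than unicode-escape when the input contained backslashes.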

Bachsau · Answer

I was still very new to Python when I asked this question. Now I understand that these fallback mechanisms are only meant for handling unexpected errors, not for saving and restoring data. If you really need a simple and reliable way to encode single Unicode characters in ASCII, have a look at the quote and unquote functions from the urllib.parse module.
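A minimal sketch of that suggestion, assuming the default arguments of quote (UTF-8 percent-encoding, with "/" left unescaped). The sample string is made up for illustration. Note that the result is an ASCII str, not cp1252 bytes.

```python
from urllib.parse import quote, unquote

# Sample text with non-ASCII characters, a literal backslash and a "%".
original = "testäch基er \\u2603 100%"

# quote() percent-encodes everything outside its safe set, including
# the backslash and the percent sign, so decoding is unambiguous.
encoded = quote(original)
restored = unquote(encoded)

print(encoded)
print(restored == original)
```

Because the escape character "%" is itself escaped, this round trip does not have the ambiguity that backslashreplace has.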

Tags: python, escaping, encoding, unicode, decode
