python3 unicode-escape doesn't work with non-ascii bytes?

Question

Python 2 has both the string-escape and unicode-escape codecs. For a UTF-8 byte string, string-escape decodes backslash escapes and leaves non-ASCII bytes untouched:

>>> "你好\\n".decode('string-escape')
'\xe4\xbd\xa0\xe5\xa5\xbd\n'

However, string-escape was removed in Python 3. We have to encode the string to bytes and then decode it with unicode-escape:

>>> "This\\n".encode('utf_8').decode('unicode_escape')
'This\n'

This works for ASCII bytes, but non-ASCII bytes are altered as well:

>>> "你好\\n".encode('utf_8')
b'\xe4\xbd\xa0\xe5\xa5\xbd\\n'
>>> "你好\\n".encode('utf_8').decode('unicode_escape').encode('utf_8')
b'\xc3\xa4\xc2\xbd\xc2\xa0\xc3\xa5\xc2\xa5\xc2\xbd\n'

Every non-ASCII byte is turned into a separate character, so re-encoding the result produces the wrong bytes.
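
The byte pattern above is consistent with unicode_escape reading every non-escape input byte as the code point of the same value (that is, as Latin-1) before it handles backslash escapes. A small check, assuming Python 3; the names raw, decoded, and expected are illustrative and not from the question:

```python
# UTF-8 bytes for 你好 followed by a literal backslash and the letter n.
raw = "你好\\n".encode("utf_8")

# unicode_escape maps each non-escape byte to the code point of the same
# value and turns the two-byte sequence \n into a real newline.
decoded = raw.decode("unicode_escape")

# Decoding the six UTF-8 bytes as Latin-1 by hand gives the same characters.
expected = raw[:-2].decode("latin_1") + "\n"

print(decoded == expected)  # the two decodings agree
print(len(decoded))         # six mojibake characters plus the newline
```

Re-encoding decoded as UTF-8 then gives two bytes per character, which is where the \xc3 and \xc2 prefixes in the output come from.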

Is there a solution for this? Is it possible in Python 3 to keep all non-ASCII bytes and still decode all escape sequences?

raylu · Accepted Answer

import codecs
codecs.getdecoder('unicode_escape')('你好\\n')
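
Note that the function returned by codecs.getdecoder returns a (decoded_text, length_consumed) tuple rather than a bare string. Separately from the answer above, one commonly used idiom that relies only on documented codecs is to encode with latin_1 and the backslashreplace error handler, then decode with unicode_escape. The sketch below assumes the input is a str whose escapes are literal backslash sequences; decode_escapes is a hypothetical helper name, not part of any library:

```python
def decode_escapes(text: str) -> str:
    """Interpret backslash escapes in text while keeping non-ASCII characters.

    Hypothetical helper for illustration; not part of the codecs module.
    """
    # Characters outside Latin-1 become \uXXXX or \UXXXXXXXX escapes;
    # Latin-1 characters become single bytes with the same value.
    raw = text.encode("latin_1", errors="backslashreplace")
    # unicode_escape reads each byte as a Latin-1 code point and decodes
    # every escape, including the ones introduced in the previous step.
    return raw.decode("unicode_escape")


print(decode_escapes("你好\\n") == "你好\n")
print(decode_escapes("café\\t!") == "café\t!")
```

This avoids the undocumented codecs.escape_decode. One caveat of the sketch: a literal backslash immediately followed by a character outside Latin-1 is not round-tripped correctly, so inputs like that would need extra handling.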

Tags:

python

python-3.x

Ning Sun

1 Answer

raylu
