
How to unescape systemd names?

Tags: python

Systemd documents its main rule for escaping non-alphanumerical characters in unit names this way:

any "/" character is replaced by "-", and all other characters which are not ASCII alphanumerics or "_" are replaced by C-style "\x2d" escapes.

and there is also an example of unescaping:

$ systemd-escape -u 'Hall\xc3\xb6chen\x2c\x20Meister'
Hallöchen, Meister

(More info in the docs here and here)
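
For reference, the quoted rule can be sketched in Python roughly as follows. This only illustrates the sentence quoted above; it is not systemd's actual implementation, which may apply further exceptions that the quote does not mention. The function name escape_unit_name is made up for this example.

```python
def escape_unit_name(text: str) -> str:
    """Escape text following only the rule quoted above (simplified)."""
    out = []
    # Work on UTF-8 bytes, so a non-ASCII character yields one escape per byte.
    for byte in text.encode('utf-8'):
        ch = chr(byte)
        if ch == '/':
            out.append('-')
        elif ch.isascii() and (ch.isalnum() or ch == '_'):
            out.append(ch)
        else:
            # C-style escape: backslash, "x", two lowercase hex digits.
            out.append(f'\\x{byte:02x}')
    return ''.join(out)

print(escape_unit_name('Hallöchen, Meister'))
```

With this sketch the "ö" becomes the two escapes \xc3\xb6, which is why unescaping has to work on bytes before decoding UTF-8.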

Let's ignore the trivial replacement "/" -> "-". I'm trying to unescape systemd names in Python (without 3rd party libraries). Many solutions I tried did not work: they converted the two-byte UTF-8 encoding of "ö" into two separate characters.

Finally this seems to produce the correct answer:

>>> esc=r'Hall\xc3\xb6chen\x2c\x20Meister'
>>> esc.encode('latin-1').decode('unicode_escape').encode('latin-1').decode('utf-8')
'Hallöchen, Meister'

As you can see, it goes: str -> bytes -> str -> bytes -> str. Could it be simplified somehow?
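
To see why both round trips are there, the chain can be split into its intermediate values. A small sketch (the step names are only for illustration):

```python
esc = r'Hall\xc3\xb6chen\x2c\x20Meister'

# str -> bytes: the backslashes are still literal characters here.
step1 = esc.encode('latin-1')

# bytes -> str: each \xNN escape becomes the code point U+00NN,
# so the two UTF-8 bytes of "ö" turn into two separate characters.
step2 = step1.decode('unicode_escape')

# str -> bytes: latin-1 maps code points U+0000..U+00FF back to single bytes.
step3 = step2.encode('latin-1')

# bytes -> str: now the byte pair c3 b6 is decoded as one UTF-8 character.
step4 = step3.decode('utf-8')

print(repr(step2))  # 'HallÃ¶chen, Meister'
print(repr(step4))  # 'Hallöchen, Meister'
```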

asked Nov 06 '22 11:11 by VPfB


1 Answer

Instead of a raw string esc = r'...', use a bytes string esc = b'...', as in this Python 3 example:

>>> esc = b'Hall\xc3\xb6chen\x2c\x20Meister'
>>> esc.decode('utf-8')
'Hallöchen, Meister'
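
Note that the bytes literal only helps when the escaped name is typed into the source code, because there Python itself interprets the \x escapes. A name obtained at runtime arrives as a str containing literal backslashes. For that case, one possible sketch is to replace each \xNN sequence at the bytes level and then decode once as UTF-8. The helper name unescape_unit_name is hypothetical, and like the question it ignores the "-" -> "/" replacement:

```python
import re

# Matches one C-style escape such as \x2d (backslash, "x", two hex digits).
_HEX_ESCAPE = re.compile(rb'\\x([0-9a-fA-F]{2})')

def unescape_unit_name(name: str) -> str:
    """Turn \\xNN escapes into raw bytes, then decode the result as UTF-8."""
    raw = name.encode('utf-8')
    # Replace every escape with the single byte it stands for.
    raw = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return raw.decode('utf-8')

print(unescape_unit_name(r'Hall\xc3\xb6chen\x2c\x20Meister'))  # Hallöchen, Meister
```

This goes str -> bytes -> str, so one round trip instead of two.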
answered Dec 05 '22 04:12 by IVI