Systemd documents its main rule for escaping non-alphanumerical characters in unit names this way:
any "/" character is replaced by "-", and all other characters which are not ASCII alphanumerics or "_" are replaced by C-style "\x2d" escapes.
and there is also an example of unescaping:
$ systemd-escape -u 'Hall\xc3\xb6chen\x2c\x20Meister'
Hallöchen, Meister
(More info in the docs here and here)
Let's ignore the trivial replacement "/" -> "-". I'm trying to unescape systemd names in Python (without third-party libraries). Many solutions I tried did not work: they converted the two-byte UTF-8 encoding of "ö" into two characters.
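To illustrate the failure described above, here is a minimal sketch of one way it can happen. It is an assumption about what such an attempt might look like, not necessarily one of the attempts actually tried. Decoding with unicode_escape alone turns every \xNN into a single code point rather than a single byte:

```python
esc = r'Hall\xc3\xb6chen\x2c\x20Meister'

# unicode_escape maps each \xNN to the code point U+00NN,
# so the two UTF-8 bytes of "ö" become two separate characters.
naive = esc.encode('ascii').decode('unicode_escape')

# Re-encoding as latin-1 recovers the original bytes (code points
# 0-255 map 1:1 to bytes), which can then be decoded as UTF-8.
fixed = naive.encode('latin-1').decode('utf-8')

print(repr(naive))
print(repr(fixed))
```

The second step shows why the latin-1 round trip is needed after unicode_escape.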
Finally this seems to produce the correct answer:
>>> esc=r'Hall\xc3\xb6chen\x2c\x20Meister'
>>> esc.encode('latin-1').decode('unicode_escape').encode('latin-1').decode('utf-8')
'Hallöchen, Meister'
As you can see, it goes str -> bytes -> str -> bytes -> str. Could it be simplified somehow?
If the escaped name is a literal in your source code, use a bytes literal esc = b'...' instead of a raw string esc = r'...', as in this Python 3 example:
>>> esc = b'Hall\xc3\xb6chen\x2c\x20Meister'
>>> esc.decode('utf-8')
'Hallöchen, Meister'
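A bytes literal only helps when the escaped name is written directly in the source code. When the name arrives at runtime as a str (for example, read from a file or from a command's output), one possible simplification is to substitute the \xNN escapes at the bytes level with the standard re module. The function name unescape_unit_name below is hypothetical, and the sketch deliberately skips the "/" and "-" replacement, as the question does:

```python
import re

# Matches a literal backslash, "x", and exactly two hex digits.
_HEX_ESCAPE = re.compile(rb'\\x([0-9a-fA-F]{2})')

def unescape_unit_name(escaped: str) -> str:
    """Hypothetical helper: turn C-style \\xNN escapes back into text.

    Assumption: does not handle the "-" <-> "/" replacement.
    """
    raw = escaped.encode('utf-8')
    # Replace each \xNN escape with the single byte it stands for.
    raw = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    # The resulting byte sequence is UTF-8, so decode it once at the end.
    return raw.decode('utf-8')

print(unescape_unit_name(r'Hall\xc3\xb6chen\x2c\x20Meister'))
```

This goes str -> bytes -> str with a single substitution in between, and it leaves any text that is not a \xNN escape untouched.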