How to reverse re.escape? This blog from 2007 says there is no reverse function, but is that still true, ten years later?
Python 2's decode('string_escape') doesn't work on all escaped chars (such as space).
>>> re.escape(' ')
'\\ '
>>> re.escape(' ').decode('string-escape')
'\\ '
Python 3: Some suggest unicode_escape, codecs.escape_decode, or ast.literal_eval, but no luck with spaces.
>>> re.escape(b' ')
b'\\ '
>>> re.escape(b' ').decode('unicode_escape')
'\\ '
>>> codecs.escape_decode(re.escape(b' '))
(b'\\ ', 2)
>>> ast.literal_eval(re.escape(b' '))
ValueError: malformed node or string: b'\\ '
So is this really the only thing that works?
>>> re.sub(r'\\(.)', r'\1', re.escape(' '))
' '
So is this really the only thing that works?
>>> re.sub(r'\\(.)', r'\1', re.escape(' '))
' '
Yes. The source for the re module contains no unescape() function, so you're definitely going to have to write one yourself.
Furthermore, the re.escape() function uses str.translate() …
def escape(pattern):
    """
    Escape special characters in a string.
    """
    if isinstance(pattern, str):
        return pattern.translate(_special_chars_map)
    else:
        pattern = str(pattern, 'latin1')
        return pattern.translate(_special_chars_map).encode('latin1')
… which, while it can transform a single character into multiple characters (e.g. [ → \[), cannot perform the reverse of that operation.
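To make the asymmetry concrete, here is a small sketch. The special_chars_map below is an illustrative stand-in covering only four characters, not the module's private _special_chars_map, and reverse_attempt is a hypothetical name for a naive reverse table.

```python
# Illustrative subset only; the real table in the re module covers more characters.
special_chars_map = {ord(c): "\\" + c for c in "[]. "}

# Forward direction: each single character maps to a two-character string.
escaped = "a[0] b".translate(special_chars_map)
print(escaped)

# translate() looks up one code point at a time, so a table cannot be keyed
# on the two-character sequence "\[". The closest thing to a reverse table
# is one that deletes every backslash.
reverse_attempt = str.maketrans({"\\": None})

# That also destroys a literal backslash, which re.escape() encodes as "\\\\".
lossy = r"a\\b".translate(reverse_attempt)
print(lossy)
```

The first print shows the escaped text a\[0\]\ b. The second shows ab, although the correct unescaped value would be a\b, so a translate-based reverse loses data.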
Since there's no direct reversal of escape() available via str.translate(), a custom unescape() function using re.sub(), as described in your question, is the most straightforward solution.
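A minimal sketch of such a helper follows, assuming Python 3.7 or later, where re.escape() only ever prefixes a character with a single backslash. The name unescape is hypothetical, and the re.DOTALL flag is an addition to the pattern from the question: re.escape() also escapes newlines, and without that flag "." would not match them.

```python
import re


def unescape(escaped):
    """Hypothetical helper: undo re.escape() by dropping the backslash
    in front of each escaped character. Accepts str or bytes."""
    if isinstance(escaped, bytes):
        # Bytes input needs bytes pattern and bytes replacement.
        return re.sub(rb"\\(.)", rb"\1", escaped, flags=re.DOTALL)
    # DOTALL lets "." match the newline that re.escape() backslash-prefixes.
    return re.sub(r"\\(.)", r"\1", escaped, flags=re.DOTALL)


# Round-trip check on a few inputs, including a literal backslash and a newline.
for sample in [" ", "a[0] b", "a\\b", "line\nbreak", b"a b\n"]:
    assert unescape(re.escape(sample)) == sample
```

This only reverses output that actually came from re.escape(). Applied to a hand-written pattern, it would also strip the backslash from sequences such as \d or \n, which mean something else.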