I am trying to implement string unescaping with Python regex and backreferences, and it doesn't seem to want to work very well. I'm sure it's something I'm doing wrong but I can't figure out what...
>>> import re
>>> mystring = r"This is \n a test \r"
>>> p = re.compile( "\\\\(\\S)" )
>>> p.sub( "\\1", mystring )
'This is n a test r'
>>> p.sub( "\\\\\\1", mystring )
'This is \\n a test \\r'
>>> p.sub( "\\\\1", mystring )
'This is \\1 a test \\1'
I'd like to replace \\[char] with \[char], but backreferences in Python don't appear to follow the same rules they do in every other implementation I've ever used. Could someone shed some light?
escape() was changed to escape only characters which are meaningful to regex operations. Note that re. escape will turn e.g. a newline into a backslash followed by a newline; one might well instead want a backslash followed by a lowercase n.
To match a character having special meaning in regex, you need to use a escape sequence prefix with a backslash ( \ ). E.g., \. matches "." ; regex \+ matches "+" ; and regex \( matches "(" . You also need to use regex \\ to match "\" (back-slash).
Use the string method startswith() for forward matching, i.e., whether a string starts with the specified string. You can also specify a tuple of strings. True is returned if the string starts with one of the elements of the tuple, and False is returned if the string does not start with any of them.
Isn't that what Anders' second example does?
In 2.5 there's also a string-escape
encoding you can apply:
>>> mystring = r"This is \n a test \r"
>>> mystring.decode('string-escape')
'This is \n a test \r'
>>> print mystring.decode('string-escape')
This is
a test
>>>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With