I've run into a problem with the re module in Python 3.6.5.
I have this pattern in my regular expression:
'\\nRevision: (\d+)\\n'
But when I run it, I'm getting a DeprecationWarning.
I searched for the problem on SO and haven't found an answer. What should I use instead of \d+? Just [0-9]+, or maybe something else?
A string contains a literal character that is a reserved character in the Regex class (for example, '(', the opening parenthesis). Placing a '\' (backslash) in front of the character in the regular expression generates an 'Invalid escape sequence' compilation error.
To insert characters that are illegal in a string, use an escape character. An escape character is a backslash \ followed by the character you want to insert.
Any character (except for the newline character) will be matched by a period in a regular expression; when you literally want a period in a regular expression you need to precede it with a backslash. Many times you'll need to express the idea of the beginning or end of a line or word in a regular expression.
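As a small illustration of the point about periods and anchors, here is a sketch using Python's re module. The sample strings and the names any_char and literal_dot are made up for the example.

```python
import re

any_char = re.compile(r'a.c')      # unescaped '.' matches any character except a newline
literal_dot = re.compile(r'a\.c')  # '\.' matches only a literal period

print(bool(any_char.fullmatch('abc')))     # True
print(bool(any_char.fullmatch('a\nc')))    # False: '.' does not match a newline by default
print(bool(literal_dot.fullmatch('a.c')))  # True
print(bool(literal_dot.fullmatch('abc')))  # False

# '^' anchors at the start of each line when re.MULTILINE is set;
# '\b' marks a word boundary.
line_starts = re.findall(r'^\w+', 'first line\nsecond line', flags=re.MULTILINE)
whole_words = re.findall(r'\bcat\b', 'cat concat cat.')
print(line_starts)  # ['first', 'second']
print(whole_words)  # ['cat', 'cat']
```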
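In a normal (non-raw) string literal, Python processes backslash escape sequences itself, before the re module ever sees the pattern. \d is not a recognized string escape, so Python keeps the backslash in the string but, starting with 3.6, emits a DeprecationWarning for the invalid escape sequence.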
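To see where the warning comes from, the sketch below compiles a one-line piece of source code that contains '\d' in a normal string literal and records whatever warnings the compiler raises. The names source, caught and namespace are illustrative only, and the warning category depends on the Python version (DeprecationWarning on 3.6, SyntaxWarning on recent releases), so the code does not assume a particular one.

```python
import warnings

# Source text holding the literal '\d+' inside a normal (non-raw) string.
source = "pattern = '\\d+'"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    code = compile(source, "<demo>", "exec")

namespace = {}
exec(code, namespace)

for w in caught:
    print(w.category.__name__, w.message)

# The unrecognized escape is kept as-is, so the string still holds a backslash, 'd' and '+'.
print(namespace["pattern"] == r"\d+")  # True
```

The pattern therefore still works at runtime; the warning is about the string literal itself, not about \d as a regex token.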
Declare your RegEx pattern as a raw string instead by prepending r, as below:
r'\nRevision: (\d+)\n'
This also means you can drop the doubled backslashes in \\n: inside a raw string, \n reaches re as a backslash followed by n, and re itself interprets that as a newline character.
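A minimal sketch of the raw-string pattern in use. The sample text is hypothetical, made up only to give the pattern something to match.

```python
import re

# Raw string: Python leaves the backslashes alone, and re interprets \n and \d.
pattern = re.compile(r'\nRevision: (\d+)\n')

# Hypothetical input for illustration only.
sample = 'Path: .\nRevision: 1234\nNode Kind: directory\n'

match = pattern.search(sample)
revision = int(match.group(1)) if match else None
print(revision)  # 1234
```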