
How to fix "<string> DeprecationWarning: invalid escape sequence" in Python?

I'm getting lots of warnings like this in Python:

    DeprecationWarning: invalid escape sequence \A
      orcid_regex = '\A[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]\Z'
    DeprecationWarning: invalid escape sequence \/
      AUTH_TOKEN_PATH_PATTERN = '^\/api\/groups'
    DeprecationWarning: invalid escape sequence \
      """
    DeprecationWarning: invalid escape sequence \.
      DOI_PATTERN = re.compile('(https?://(dx\.)?doi\.org/)?10\.[0-9]{4,}[.0-9]*/.*')
    <unknown>:20: DeprecationWarning: invalid escape sequence \(
    <unknown>:21: DeprecationWarning: invalid escape sequence \(

What do they mean? And how can I fix them?

Asked by Sean Hammond, Sep 14 '18 at 16:09




1 Answer

\ is the escape character in Python string literals.

For example if you want to put a tab character in a string you would do:

    >>> print("foo \t bar")
    foo      bar

If you want to put a literal \ in a string you have to use \\:

    >>> print("foo \\ bar")
    foo \ bar

Or use a "raw string":

    >>> print(r"foo \ bar")
    foo \ bar

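You can't put a backslash in a string literal wherever you like. A backslash is only valid when it begins one of Python's recognized escape sequences (such as \t, \n or \\). Anything else is an invalid escape sequence: Python 3.6 and later emit a DeprecationWarning for it, Python 3.12 and later emit a SyntaxWarning instead, and a future version is expected to make it a SyntaxError. For example \A isn't an escape sequence: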

    $ python3.6 -Wd -c '"\A"'
    <string>:1: DeprecationWarning: invalid escape sequence \A
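If you would rather check a snippet from inside Python than from the shell, something like the following sketch should work. The helper name escape_warnings is made up for this example, and the exact warning text and category differ between Python versions, so the code only collects whatever messages the compiler emits.

```python
import warnings


def escape_warnings(source):
    """Compile a piece of source text and return any warning messages raised.

    Hypothetical helper for illustration; not part of the standard library.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones that are ignored by default.
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [str(w.message) for w in caught]


# The source text being compiled here is:  s = "\A"
print(escape_warnings('s = "\\A"'))

# The source text being compiled here is:  s = r"\A"
print(escape_warnings('s = r"\\A"'))
```

The first call should report an invalid escape sequence and the second should report nothing, because the raw-string prefix makes the backslash an ordinary character.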

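If your backslash sequence does accidentally match one of Python's escape sequences, but you didn't mean it to, that's even worse: Python silently substitutes the escaped character and prints no warning at all. For example, \b in a regular expression written as a normal string becomes a backspace character rather than the regex word-boundary token.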
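Here is a small sketch of that accidental-match case, using \b, which is a backspace in a Python string literal but a word boundary in regular expression syntax. The variable names and the sample text are only for illustration.

```python
import re

plain = "\bcat\b"   # \b is a valid string escape: each one becomes a backspace
raw = r"\bcat\b"    # raw string: backslashes are kept, so re sees word boundaries

print(len(plain), len(raw))        # 5 7
print(re.search(plain, "a cat"))   # None: the pattern looks for backspace characters
print(re.search(raw, "a cat"))     # a match object for "cat"
```

No warning is printed for the first pattern, which is why this kind of bug is easy to miss.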

So you should always use raw strings or \\.

It's important to remember that a string literal is still a string literal even if that string is intended to be used as a regular expression. Python's regular expression syntax supports lots of special sequences that begin with \. For example \A matches the start of a string. But \A is not valid in a Python string literal! This is invalid:

my_regex = "\Afoo" 

Instead you should do this:

my_regex = r"\Afoo" 
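As a rough illustration, here is the ORCID pattern from the question rewritten as a raw string. The name ORCID_REGEX and the sample iD are assumptions made for this example.

```python
import re

# Same pattern as in the question, with an r prefix so \A and \Z reach the
# re module untouched instead of being treated as string escapes.
ORCID_REGEX = re.compile(r"\A[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]\Z")

# A raw string and a doubled backslash produce exactly the same string.
print(r"\Afoo" == "\\Afoo")  # True

print(bool(ORCID_REGEX.search("0000-0002-1825-0097")))      # True
print(bool(ORCID_REGEX.search("id: 0000-0002-1825-0097")))  # False, \A anchors at the start
```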

Docstrings are another one to remember: docstrings are string literals too, so invalid \ sequences are just as invalid there. Use raw strings (r"""...""") for docstrings that contain backslashes.
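A minimal sketch of a raw docstring is shown below. The function strip_prefix and the Windows-style paths are hypothetical and exist only to give the docstring some backslashes.

```python
def strip_prefix(path):
    r"""Remove a leading C:\ from path, so C:\new\data becomes new\data."""
    # Hypothetical helper: outside the raw docstring, the backslash is doubled.
    prefix = "C:\\"
    return path[len(prefix):] if path.startswith(prefix) else path


print(strip_prefix.__doc__)
print(strip_prefix("C:\\new\\data"))
```

Without the r prefix, the \n in C:\new would silently become a newline and \d would trigger the invalid escape sequence warning.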

Answered by Sean Hammond, Sep 20 '22 at 14:09