I ran across something once upon a time and wondered if it was a Python "bug" or at least a misfeature. I'm curious if anyone knows of any justifications for this behavior. I thought of it just now reading "Code Like a Pythonista," which has been enjoyable so far. I'm only familiar with the 2.x line of Python.
Raw strings are strings that are prefixed with an r. This is great because I can use backslashes in regular expressions and I don't need to double everything everywhere. It's also handy for writing throwaway scripts on Windows, so I can use backslashes there also. (I know I can also use forward slashes, but throwaway scripts often contain content cut and pasted from elsewhere in Windows.)
So great! Unless, of course, you really want your string to end with a backslash. There's no way to do that in a 'raw' string.
In [9]: r'\n'
Out[9]: '\\n'

In [10]: r'abc\n'
Out[10]: 'abc\\n'

In [11]: r'abc\'
------------------------------------------------
   File "<ipython console>", line 1
     r'abc\'
           ^
SyntaxError: EOL while scanning string literal

In [12]: r'abc\\'
Out[12]: 'abc\\\\'
So one backslash before the closing quote is an error, but two backslashes give you two backslashes! Surely I'm not the only one who is bothered by this?
Thoughts on why 'raw' strings are 'raw, except for backslash-quote'? I mean, if I wanted to embed a single quote in there I'd just use double quotes around the string, and vice versa. If I wanted both, I'd just triple quote. If I really wanted three quotes in a row in a raw string, well, I guess I'd have to deal, but is this considered "proper behavior"?
This is particularly problematic with folder names in Windows, where the backslash is the path delimiter.
If you need to end a raw string with a single backslash, you can use two and slice off the second.
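Here is a rough sketch of that slicing trick. The folder name is made up for illustration; the only point is that the raw literal ends in two backslashes and the slice drops the last one.

```python
# Hypothetical Windows folder name, used only for illustration.
# The raw literal ends with two backslashes; [:-1] slices off the second.
path = r'C:\temp\logs\\'[:-1]

# The result ends with exactly one backslash.
assert path.endswith('\\')
assert not path.endswith('\\\\')

# Same string as the fully escaped, non-raw spelling.
assert path == 'C:\\temp\\logs\\'

print(path)
```

It is a little ugly, but it keeps the rest of the path readable as a raw string.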
Escape sequences
In Python, characters that cannot be typed directly into a normal string (such as tabs, line feeds, etc.) are described using an escape sequence with a backslash \ (such as \t or \n), similar to the C language.
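A quick way to convince yourself of this is to check string lengths. This is just a small sketch; the variable names are arbitrary.

```python
# In a normal (non-raw) string, backslash-t and backslash-n
# each collapse into a single character.
tab = 'a\tb'
newline = 'a\nb'

assert len(tab) == 3        # 'a', TAB, 'b'
assert len(newline) == 3    # 'a', LF, 'b'

# A doubled backslash is itself an escape sequence for one backslash.
single_backslash = '\\'
assert len(single_backslash) == 1

# repr() shows the escaped spelling rather than the raw characters.
print(repr(tab))
```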
A Python raw string is created by prefixing a string literal with 'r' or 'R'. A raw string treats the backslash (\) as a literal character. This is useful when we want a string that contains backslashes and don't want them to be treated as escape characters.
A Python raw string is a normal string, prefixed with an r or R. This treats characters such as backslash ('\') as a literal character. This also means that this character will not be treated as an escape character.
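To make the difference concrete, here is a small comparison of a normal and a raw literal, plus the backslash-quote case the original question is about. Treat it as a sketch; the names are arbitrary.

```python
normal = 'abc\n'   # backslash-n becomes one newline character
raw = r'abc\n'     # backslash and 'n' are kept as two characters

assert len(normal) == 4
assert len(raw) == 5
assert raw == 'abc\\n'

# The catch: inside a raw literal, a backslash still stops the
# following quote from ending the string, and the backslash itself
# stays in the result.
quoted = r'\''
assert len(quoted) == 2
assert quoted == "\\'"
```

That last case is why r'abc\' is a syntax error: the backslash keeps the closing quote from ending the literal, so the string is never terminated.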
It's a FAQ.
And in response to "you really want your string to end with a backslash. There's no way to do that in a 'raw' string.": the FAQ shows how to work around it.
>>> r'ab\c' '\\' == 'ab\\c\\'
True
>>>
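The session above works because adjacent string literals are joined at compile time, so a raw literal can sit next to a normal one that supplies the trailing backslash. A minimal sketch with a made-up folder name:

```python
# Hypothetical folder name, used only for illustration.
# Adjacent literals are concatenated: raw part + normal '\\'.
folder = r'C:\temp\logs' '\\'

# Explicit concatenation with + gives the same result.
also = r'C:\temp\logs' + '\\'

assert folder == also
assert folder == 'C:\\temp\\logs\\'
assert folder.endswith('\\')

print(folder)
```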