Why doesn't Python auto escape '\' in __doc__?

It seems that escape sequences are still interpreted inside docstrings. For example, running python foo.py (Python 2.7.10) on the following file fails with an error like ValueError: invalid \x escape.

def f():
    """
    do not deal with '\x0'
    """
    pass

In fact, it seems the correct docstring should be:

    """
    do not deal with '\\x0'
    """

The error also breaks importing the module.

For Python 3.4.3+, the error message is:

  File "foo.py", line 4
    """
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 24-25: truncated \xXX escape

I find this a bit strange, since I expected the escape to affect only __doc__ and to have no side effect on the module itself.

Why was it designed this way? Is it a flaw or bug in Python?

NOTE

I know what """ and raw literals mean; however, I think the Python interpreter should be able to treat docstrings specially, at least in theory.

Hongxu Chen asked Nov 16 '15 11:11


1 Answer

From PEP 257:

For consistency, always use """triple double quotes""" around docstrings. Use r"""raw triple double quotes""" if you use any backslashes in your docstrings. For Unicode docstrings, use u"""Unicode triple-quoted strings""".

There are two forms of docstrings: one-liners and multi-line docstrings.
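As a minimal sketch of the PEP 257 advice applied to the function from the question (Python 3 assumed; the second function g is a hypothetical name added only for comparison), the raw-literal form and the doubled-backslash form should produce the same __doc__:

```python
def f():
    r"""
    do not deal with '\x0'
    """
    pass


def g():
    """
    do not deal with '\\x0'
    """
    pass


# Both docstrings contain a single literal backslash followed by "x0".
print(f.__doc__ == g.__doc__)  # True
print("\\x0" in f.__doc__)     # True
```

The raw form is usually easier to read, because the text in the source matches what help(f) displays.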


Also from here:

There's no such python type as "raw string" -- there are raw string literals, which are just one syntax approach (out of many) to specify constants (i.e., literals) that are of string types.

So "getting" something "as a raw string" just makes no sense. You can write docstrings as raw string literals (i.e., with the prefix r -- that's exactly what denotes a raw string literal, the specific syntax that identifies such a constant to the python compiler), or else double up any backslashes in them (an alternative way to specify constant strings including backslash characters), but that has nothing to do with "getting" them one way or another.
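Because a docstring is an ordinary string literal, a bad escape in it is rejected when the file is compiled, before any function object or __doc__ exists. That would explain why the whole module fails to run or import. The sketch below illustrates this under Python 3 by compiling the question's source from a string; BAD_SOURCE, GOOD_SOURCE, and compiles are hypothetical names used only for this illustration.

```python
# The question's file, held in a raw literal so the backslash survives intact.
BAD_SOURCE = r'''
def f():
    """
    do not deal with '\x0'
    """
    pass
'''

# Same file, but the docstring is turned into a raw literal (r""" ... """).
GOOD_SOURCE = BAD_SOURCE.replace('"""', 'r"""', 1)


def compiles(source):
    """Return True if the source compiles, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(BAD_SOURCE))   # False: truncated \xXX escape in the literal
print(compiles(GOOD_SOURCE))  # True
```

Note that Python 2.7 reports the same problem as a ValueError rather than a SyntaxError, as the question shows.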

Remi Crystal answered Oct 12 '22 13:10