 

Why do definitions have a space before the colon in NumPy docstring sections?


The numpydoc docstring guide says:

The colon must be preceded by a space, or omitted if the type is absent.

and gives an example:

Parameters
----------
x : type
    Description of parameter `x`.
y
    Description of parameter `y` (with type not specified)

On the other hand, PEP 8 explicitly says that a space before the colon is wrong:

# Wrong:

code:int  # No space after colon
code : int  # Space before colon

I know that this applies to code, not to docstrings, but still, why not be consistent?

Question

What is the motivation for putting a space before the colon?

It seems to violate typographical rules and also the Python convention (or at least intuition).

hans asked Jun 03 '20 07:06


1 Answer

Why a space before the colon?

Because the numpydoc format deliberately makes the entries in some docstring sections coincide with the syntax of a reStructuredText definition list. The syntax is exactly the same as the one given in the reST markup specification:

Definition Lists

Each definition list item contains a term, optional classifiers, and a definition. A term is a simple one-line word or phrase. Optional classifiers may follow the term on the same line, each after an inline " : " (space, colon, space).

Syntax diagram:

+----------------------------+
| term [ " : " classifier ]* |
+--+-------------------------+--+
   | definition                 |
   | (body elements)+           |
   +----------------------------+
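The term line in that diagram can be illustrated with a minimal sketch. The helper `split_term_line` below is hypothetical and only mimics the " : " separator rule; the real docutils parser also handles escapes and inline markup, which this sketch ignores.

```python
def split_term_line(line):
    """Split a definition-list term line into (term, classifiers).

    Hypothetical helper: it only applies the " : " (space, colon, space)
    separator rule quoted above and is not the docutils implementation.
    """
    term, *classifiers = line.split(" : ")
    return term.strip(), [c.strip() for c in classifiers]


print(split_term_line("x : int"))  # term 'x' with one classifier, 'int'
print(split_term_line("y"))        # term 'y' with no classifier
print(split_term_line("x: int"))   # no " : " separator, so the whole line is the term
```

Without the space before the colon, the line contains no " : " separator, so under this rule the type is not recognised as a classifier and stays part of the term.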

This makes sense, since numpydoc clearly states that it intends to follow PEP 257.

numpydoc docstring guide

Overview

We mostly follow the standard Python style conventions as described here:

  • Docstring Conventions - PEP 257

And a related PEP, PEP 287, states its intent that docstrings should be written with reST constructs:

Abstract, PEP 287

This PEP proposes that the reStructuredText markup be adopted as a standard markup format for structured plaintext documentation in Python docstrings

This can also be verified from the numpydoc contributors' discussions at the time the decisions were made, for example:

Issue #87

Right now numpydoc format is actually valid rst (just with some special interpretation of certain markup constructs), e.g. the parameters field is a definition list where the type is a "classifier" (http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#definition-lists). I would argue that it is worthwhile to keep this property, which end-of-line backslashes do (they simply do not appear in the string itself), whereas the proposed "recognize indentation" syntax does not.

The same reasoning is mentioned in several places:

PR #107

This probably falls under the category of "if it ain't broke, don't fix it", but I note that we're strangely using blockquotes for parameter listings instead of definition lists. UPDATED: now this PR proposes to use definition lists by default, with a switch to use the legacy blockquotes.

The specific rule requiring a space before the colon can be seen in the numpydoc/validate.py source code and in the documentation:

Built-in Validation Checks

"PR10": 'Parameter "{param_name}" requires a space before the colon '
       "separating the parameter name and type"
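As a rough illustration of what such a check looks for, here is a minimal sketch. The function `missing_space_before_colon` and its regular expression are assumptions made for this example; they are not numpydoc's actual implementation of PR10.

```python
import re

# Hypothetical pattern: a parameter name (optionally prefixed with * or **)
# followed immediately by a colon, i.e. with no space in between.
_NAME_THEN_COLON = re.compile(r"^\*{0,2}\w+:")


def missing_space_before_colon(param_line):
    """Return True if a parameter line has no space before its colon."""
    return bool(_NAME_THEN_COLON.match(param_line.strip()))


for line in ("x : int", "x: int", "y", "**kwargs: dict"):
    print(repr(line), missing_space_before_colon(line))
```

Only the lines where the colon directly follows the name are flagged; a line with no type at all, such as `y`, passes because the colon is omitted.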

In conclusion, if docstrings are to be written with reST (in the spirit of PEP 257 and PEP 287), there aren't many list markup constructs among the reST body elements to choose from. Definition lists are simply the best choice, because their term/classifier syntax fits the name/type listing of Python objects perfectly.



Addressing an intuitive objection raised in the question:

On the other hand, PEP 8 explicitly says that a space before the colon is wrong

Yes, but the function and variable annotations that PEP 8 discusses are not documentation strings (docstrings). That rule is intended for signatures and variable declarations.
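The two conventions can coexist in one function without conflict. The sketch below uses a made-up function, `scale`, purely for illustration: the signature follows PEP 8 (no space before the colon), while the docstring follows numpydoc (space before the colon).

```python
def scale(x: float, factor: float = 2.0) -> float:
    """Multiply a value by a factor.

    Parameters
    ----------
    x : float
        Value to scale.
    factor : float, optional
        Multiplier applied to `x`.

    Returns
    -------
    float
        The scaled value.
    """
    # The annotations above are Python code, so PEP 8 spacing applies;
    # the docstring is reST text, so the definition-list spacing applies.
    return x * factor


print(scale(3.0))
```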

bad_coder answered Sep 30 '22 21:09