Documenting and detailing a single script based on the comments inside

Tags: python, documentation

I am going to write a set of scripts, each independent from the others but with some similarities. The structure will most likely be the same for all the scripts and probably looks like:

# -*- coding: utf-8 -*-
"""
Small description and information
@author: Author
"""

# Imports
import numpy as np
import math
from scipy import signal
...

# Constant definition (always with variable in capital letters)
CONSTANT_1 = 5
CONSTANT_2 = 10

# Main class
class Test():
    def __init__(self, run_id, parameters):
        # Some stuff not too important
        
    def _run(self, parameters):
        # Main program returning a result object.

For each script, I would like to write documentation and export it to PDF. I need a library/module/parser which reads the script, extracts the marked comments and code, and puts them back together in the desired output format.

For instance, in the _run() method, there might be several steps detailed in the comments:

def _run(self, parameters):
        # Step 1: we start by doing this
        code to do it
            
        # Step 2: then we do this
        code to do it
        code 
        code # this code does that

Which library/parser could I use to analyze the Python script and output a PDF? At first, I was thinking of Sphinx, but it is not suited to my needs, as I would have to design a custom extension. Moreover, Sphinx's strength lies in the links and hierarchy between multiple scripts of the same or of different modules. In my case, I will only be documenting one script, one file at a time.

My second idea is to use the RST format and RST2PDF to create the PDF. I could then design a parser which reads the .py file, extracts the commented/decorated lines or sets of lines as proposed below, and writes the RST file.

#-description
## Title of something
# doing this here
#-

#-code
some code to extract and put in the doc
some more code
#-

Finally, I would also like to be able to execute some code and capture the result in order to put it in the output PDF file. For instance, I could run some Python code to compute the SHA1 hash of the .py file content and include it as a reference in the PDF documentation.


asked Jul 13 '20 13:07

Mathieu

2 Answers

Docstrings instead of comments

In order to make things easier for yourself, you probably want to make use of docstrings rather than comments:

A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.

This way, you can read the __doc__ attribute when parsing the scripts to generate documentation.

The triple-quoted string placed immediately after the function/module definition, which becomes the docstring, is just syntactic sugar. You can edit the __doc__ attribute programmatically as needed.
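As a small illustration of that last point, here is a minimal sketch showing that the docstring is an ordinary attribute that can be read and reassigned at runtime. The function name `compute` is made up for the example:

```python
def compute(x):
    """Return twice the input."""
    return 2 * x


# The docstring is a plain attribute on the function object.
original = compute.__doc__

# It can be extended programmatically, e.g. to append generated notes.
compute.__doc__ = original + "\n\nNote: this line was appended at runtime."

print(compute.__doc__)
```

Keep in mind that running Python with the -OO option strips docstrings, in which case `__doc__` is None.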

For example, a decorator can make the creation of docstrings nicer in your specific case by letting you describe the steps inline while still adding them to the docstring (written in the browser, so it may contain errors):

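def with_steps(func):
  """Attach an add_step helper that appends step descriptions to func's docstring."""
  def add_step(n, doc):
    # Fall back to an empty string in case func has no docstring.
    func.__doc__ = (func.__doc__ or "") + "\nStep %d: %s" % (n, doc)
  func.add_step = add_step
  return func  # a decorator must return the function, otherwise _run would be None

@with_steps
def _run(self, parameters):
  """Initial description that is turned into the initial docstring"""
  _run.add_step(1, "we start by doing this")
  ...  # code to do it

  _run.add_step(2, "then we do this")
  ...  # code to do it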

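Once _run has been called, this would produce a docstring like the following (the steps are only recorded when the function body runs, and they are appended again on every call):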

Initial description that is turned into the initial docstring
Step 1: we start by doing this
Step 2: then we do this

You get the idea.

Generating PDF from documented scripts

Sphinx

Personally, I'd just try the PDF-builders available for Sphinx, via the bundled LaTeXBuilder or using rinoh if you don't want to depend on LaTeX.

However, you would have to use a docstring format that Sphinx understands, such as reStructuredText or Google Style Docstrings.

AST

An alternative is to use the ast module to extract the docstrings. Note that the Sphinx autodoc extension takes a different route: it imports the modules to be documented and reads the docstrings from the live objects. There are a few examples out there on how to use ast for this, like this gist or this blog post.

This way you can write a script that parses the source and outputs any format you want. For instance, you can output Markdown or reST and convert it to PDF using pandoc.
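A minimal sketch of such a script, using only the standard library. The function name `docstrings_to_markdown`, the sample `SOURCE` string and the Markdown layout are assumptions made for the illustration; for a real script you would read the source from the .py file instead:

```python
import ast

# Sample source standing in for the content of a .py file.
SOURCE = '''
"""Small description and information."""

class Test:
    """Main class."""

    def _run(self, parameters):
        """Main program returning a result object."""
        return None
'''


def docstrings_to_markdown(source):
    """Return a Markdown string built from the docstrings found in source."""
    tree = ast.parse(source)
    lines = []

    # The module docstring becomes the introduction.
    module_doc = ast.get_docstring(tree)
    if module_doc:
        lines.extend([module_doc, ""])

    # Every documented class or function gets its own section.
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc:
                lines.extend(["## " + node.name, "", doc, ""])

    return "\n".join(lines)


print(docstrings_to_markdown(SOURCE))
```

Because the source is only parsed and never imported, this works even if the script's dependencies (numpy, scipy, ...) are not installed on the machine building the documentation.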

You could write marked up text directly in the docstrings, which would give you a lot of flexibility. Let's say you wanted to write your documentation using markdown – just write markdown directly in your docstring.

def _run(self, parameters):
  """Example script
  ================

  This script does a, b, c

  1. Does something first
  2. Does something else next
  3. Returns something else

  Usage example:
  
      results = script(parameters)
      foo = [r.foo for r in results]
  """

This string can be extracted using ast and parsed/processed using whatever library you see fit.
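The question also asks for a SHA1 hash of the .py file in the output. With a custom extraction script, such a value can be computed with the standard hashlib module and appended to the generated text before converting it. A minimal sketch; the function names and the footer format are assumptions, not part of any existing tool:

```python
import hashlib


def sha1_of_source(source_bytes):
    """Return the hexadecimal SHA1 digest of the given bytes."""
    return hashlib.sha1(source_bytes).hexdigest()


def append_hash_footer(markdown, source_bytes):
    """Return the Markdown text with a footer line holding the source hash."""
    digest = sha1_of_source(source_bytes)
    return markdown.rstrip("\n") + "\n\nSource SHA1: `" + digest + "`\n"


# For a real script, read the bytes with open(path, "rb").read() instead.
demo_source = b"CONSTANT_1 = 5\n"
print(append_hash_footer("# Example script", demo_source))
```

Hashing the raw bytes rather than decoded text keeps the value identical to what command-line tools such as sha1sum report for the same file.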

answered Nov 14 '22 22:11

Henrik

Comments are not suitable for documentation; typically they are used to highlight specific aspects that are relevant only to developers (not users). To achieve your goal, you can use doc-strings (exposed as __doc__) in various places:

module-level
class-level
function-/method-level
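To illustrate, the class-level and method-level doc-strings can be read back with the standard inspect module, which also removes the indentation of multi-line doc-strings. This is a minimal sketch using a cut-down version of the Test class; a module-level doc-string is available in the same way through the imported module's __doc__ attribute:

```python
import inspect


class Test:
    """Main class."""

    def _run(self, parameters):
        """Main program returning a result object.

        Uses `func1` to compute X and then `func2` to convert it to Y.
        """
        return None


# inspect.getdoc returns the doc-string with its indentation cleaned up.
print(inspect.getdoc(Test))
print(inspect.getdoc(Test._run))
```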

If your _run method is really long and you feel the doc-string is too far away from the actual code, this is a strong sign that your function is too long anyway. It should be split into multiple smaller functions to improve clarity, each of which can have its own doc-string. For example, the Google style guide suggests that if a function exceeds about 40 lines, you should consider breaking it into smaller pieces.

Then you can use, for example, Sphinx to parse that documentation and convert it to PDF format.

Here's an example setup (using Google doc style):

# -*- coding: utf-8 -*-
"""
Small description and information.
@author: Author

Attributes:
    CONSTANT_1 (int): Some description.
    CONSTANT_2 (int): Some description.
"""

import numpy as np
import math
from scipy import signal


CONSTANT_1 = 5
CONSTANT_2 = 10


class Test():
    """Main class."""
    def __init__(self, run_id, parameters):
        """Some stuff not too important."""
        pass
        
    def _run(self, parameters):
        """Main program returning a result object.

        Uses `func1` to compute X and then `func2` to convert it to Y.

        Args:
            parameters (dict): Parameters for the computation

        Returns:
            result
        """
        X = self.func1(parameters)
        Y = self.func2(X)
        return Y

    def func1(self, p):
        """Information on this method."""
        pass

    def func2(self, x):
        """Information on this method."""
        pass

Then with Sphinx you can use the sphinx-quickstart command line utility to set up a sample project. In order to create documentation for the script you can use sphinx-apidoc. For that purpose you can create a separate directory scripts, add an empty __init__.py file and place all your scripts inside that directory. After running these steps the directory structure will look like the following, assuming you didn't separate build and source directories during sphinx-quickstart (not separating them is the default):

$ tree
.
├── _build
├── conf.py
├── index.rst
├── make.bat
├── Makefile
├── scripts
│   ├── __init__.py
│   └── example.py
├── _static
└── _templates

The files generated by sphinx-apidoc rely on the autodoc extension (sphinx.ext.autodoc), so it needs to be enabled. Depending on the doc-style you use, you might also need to enable a corresponding extension. The above example uses Google doc style, which is handled by the Napoleon extension. These extensions can be enabled in conf.py:

extensions = ['sphinx.ext.autodoc', 'sphinx.ext.napoleon']

Then you can run sphinx-apidoc as follows. The -e flag puts every module/script on a separate page, -f overwrites existing doc files, -P documents private members (those starting with _), and -o api sets the output directory:

$ sphinx-apidoc -efPo api scripts/
Creating file api/scripts.rst.
Creating file api/scripts.example.rst.
Creating file api/modules.rst.

This command created the files with the necessary directives for the actual build command. For the build tool to be able to import and correctly document your scripts, you also need to set the import path accordingly. This can be done by uncommenting (or adding) the following three lines near the top of conf.py:

import os
import sys
sys.path.insert(0, os.path.abspath('.'))

To make your scripts' docs appear in the documentation you need to link them from within the main index.rst file:

Welcome to ExampleProject's documentation!
==========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   api/modules

Finally, you can run the build command:

$ make latexpdf

Then the resulting documentation can be found at _build/latex/<your-project-name>.pdf.

This is a screenshot of the resulting documentation:

[Screenshot: Example APIdoc]

Note that there are various themes available to change the look of your documentation. Sphinx also supports plenty of configuration options to customize the build of your documentation.

answered Nov 14 '22 23:11

a_guest
