
Creating composable/hierarchical command-line parsers using Python/argparse

(A simplified form of the problem.) I'm writing an API involving some Python components. These might be functions, but for concreteness let's say they're objects. I want to be able to parse options for the various components from the command line.

from argparse import ArgumentParser

class Foo(object):
    def __init__(self, foo_options):
        """do stuff with options"""

    """..."""

class Bar(object):
    def __init__(self, bar_options):
        """..."""

def foo_parser():
    """(could also be a Foo method)"""
    p = ArgumentParser()
    p.add_argument('--option1')
    #...
    return p

def bar_parser(): "..."

But now I want to be able to build larger components:

def larger_component(options):
    f1 = Foo(options.foo1)
    f2 = Foo(options.foo2)
    b  = Bar(options.bar)
    # ... do stuff with these pieces

Fine. But how to write the appropriate parser? We might wish for something like this:

def larger_parser(): # probably need to take some prefix/ns arguments
    # general options to be overridden by p1, p2
    # (this could be done automagically or by hand in `larger_component`):
    p  = foo_parser(prefix=None,          namespace='foo')
    p1 = foo_parser(prefix='first-foo-',  namespace='foo1')
    p2 = foo_parser(prefix='second-foo-', namespace='foo2')
    b  = bar_parser()
    # (you wouldn't actually specify the prefix/namespace twice: )
    return combine_parsers([(p1, namespace='foo1', prefix='first-foo-'),
                            (p2,...),p,b])

larger_component(larger_parser().parse_args())
# CLI should accept --first-foo-option1, --second-foo-option1, --option1  (*)

which looks a bit like argparse's parents feature if you forget that we want prefixing (so as to be able to add multiple parsers of the same type) and probably namespacing (so that we can build tree-structured namespaces to reflect the structure of the components).

Of course, we want larger_component and larger_parser to be composable in the same way, and the namespace object passed to a certain component should always have the same internal shape/naming structure.

The trouble seems to be that the argparse API is basically about mutating your parsers, but querying them is more difficult - if you turned a datatype into a parser directly, you could just walk these objects. I managed to hack something that somewhat works if the user writes a bunch of functions to add arguments to parsers by hand, but each add_argument call must then take a prefix, and the whole thing becomes quite inscrutable and probably non-composable. (You could abstract over this at the cost of duplicating some parts of the internal data structures ...). I also tried to subclass the parser and group objects ...

You could imagine this might be possible using a more algebraic CLI-parsing API, but I don't think rewriting argparse is a good solution here.

Is there a known/straightforward way to do this?

Fixnum asked Oct 20 '22 00:10


1 Answer

Some thoughts that may help you construct the larger parser:

import argparse

parser = argparse.ArgumentParser(...)
arg1 = parser.add_argument('--foo',...)

Now arg1 is a reference to the Action object created by add_argument. I'd suggest doing this in an interactive shell and looking at its attributes. Or at least print its repr. You can also experiment with modifying attributes. Most of what a parser 'knows' about the arguments is contained in these actions. In a sense a parser is an object that 'contains' a bunch of 'actions'.
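As a small illustration of that kind of inspection, here is a sketch; the `prog` name and the `--foo` option settings are made up for the example. It prints the Action's repr and a few of its attributes, then changes the default after the fact to show that the parser reads it from the Action at parse time.

```python
import argparse

parser = argparse.ArgumentParser(prog='demo')  # 'demo' is an arbitrary name
arg1 = parser.add_argument('--foo', type=int, default=0, help='an example option')

# The return value of add_argument is the Action object itself.
print(repr(arg1))
print(arg1.dest, arg1.option_strings, arg1.default)

# Attributes can be modified afterwards; the parser consults the Action
# when it parses, so the new default takes effect.
arg1.default = 5
args = parser.parse_args([])
print(args.foo)
```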

Look also at:

parser._actions

This is the parser's master list of actions, which will include the default help as well as the ones you add.
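A quick way to see that list is to walk it, as in this sketch. Note that `_actions` is an underscore-prefixed attribute, so treat it as an implementation detail rather than a documented API; the argument names here are only examples.

```python
import argparse

parser = argparse.ArgumentParser(prog='demo')
parser.add_argument('--foo')
parser.add_argument('bar')

# Walk the parser's list of actions; the automatically added help
# action appears alongside the ones defined above.
for action in parser._actions:
    print(type(action).__name__, action.dest, action.option_strings)

dests = [action.dest for action in parser._actions]
```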

The parents mechanism copies Action references from the parent to the child. Note, it does not make copies of the Action objects. It also recreates argument groups - but these groups only serve to group help lines. They have nothing to do with parsing.
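The sharing can be checked directly. In this sketch (option name and defaults are invented), the child ends up holding the very same Action object as the parent, so a change made through the parent's reference is also seen by the child. That is worth keeping in mind before trying to rename or re-prefix inherited actions per child.

```python
import argparse

# add_help=False avoids a clash between the parent's and the child's -h.
parent = argparse.ArgumentParser(add_help=False)
shared = parent.add_argument('--option1', default='from-parent')

child = argparse.ArgumentParser(prog='child', parents=[parent])

# Identity check: the child's list holds the same object, not a copy.
is_shared = any(action is shared for action in child._actions)
print(is_shared)

# Editing the Action through the parent's reference affects the child too.
shared.default = 'changed-later'
child_args = child.parse_args([])
print(child_args.option1)
```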

args1, extras = parser.parse_known_args(argv, namespace)

is very useful when dealing with multiple parsers. With it, each parser can handle the arguments it knows about, and pass the rest on to others. Try to understand the inputs and outputs to that method.
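A minimal sketch of that hand-off, assuming two independent parsers with made-up options (`--option1` for a Foo-like component, `--size` for a Bar-like one): the first parser consumes what it recognises and returns the remainder, which is then fed to the second.

```python
import argparse

foo = argparse.ArgumentParser(prog='foo', add_help=False)
foo.add_argument('--option1')

bar = argparse.ArgumentParser(prog='bar', add_help=False)
bar.add_argument('--size', type=int)

argv = ['--option1', 'a', '--size', '3']

# Each call returns (namespace, list_of_unrecognised_strings).
foo_args, extras = foo.parse_known_args(argv)
bar_args, leftover = bar.parse_known_args(extras)

print(foo_args, extras)
print(bar_args, leftover)
```

Anything still in `leftover` after the last parser has run was not recognised by any of them, so that is the place to report an error if strict parsing is wanted.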

We have talked about composite Namespace objects in earlier SO questions. The default argparse.Namespace class is a simple object class with a repr method. The parser just uses hasattr, getattr and setattr, trying to be as non-specific as it can. You could construct a more elaborate namespace class.
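One possible shape for such a class is sketched below. `NestedNamespace` is a hypothetical name, and the convention of putting a dot in `dest` to mean "sub-namespace" is an assumption of this example, not something argparse defines. It relies only on the parser's use of `hasattr`, `getattr` and `setattr`.

```python
import argparse

class NestedNamespace(argparse.Namespace):
    """Hypothetical namespace that treats 'a.b' as attribute b of child a."""

    def __setattr__(self, name, value):
        if '.' in name:
            head, rest = name.split('.', 1)
            child = self.__dict__.setdefault(head, NestedNamespace())
            setattr(child, rest, value)
        else:
            super().__setattr__(name, value)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if '.' in name:
            head, rest = name.split('.', 1)
            if head in self.__dict__:
                return getattr(self.__dict__[head], rest)
        raise AttributeError(name)

parser = argparse.ArgumentParser(prog='larger')
parser.add_argument('--first-foo-option1', dest='foo1.option1')
parser.add_argument('--second-foo-option1', dest='foo2.option1')

ns = parser.parse_args(
    ['--first-foo-option1', 'a', '--second-foo-option1', 'b'],
    namespace=NestedNamespace(),
)
print(ns.foo1.option1, ns.foo2.option1)
```

With this arrangement `ns.foo1` and `ns.foo2` have the same internal shape, so each could be handed to a separate `Foo` instance.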

Related question: argparse subcommands with nested namespaces

You can also customize the Action classes. That's where most values are inserted into the Namespace (though defaults are set elsewhere).
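As a rough sketch of that route, the hypothetical `StoreNested` action below writes its value into a sub-namespace named by the part of `dest` before the dot (again an invented convention). Because defaults are written by the parser rather than by the action, the example uses `default=argparse.SUPPRESS` so that the parser does not also set a flat attribute literally named `foo1.option1`.

```python
import argparse

class StoreNested(argparse.Action):
    """Hypothetical action: dest 'group.name' is stored as namespace.group.name."""

    def __call__(self, parser, namespace, values, option_string=None):
        group, name = self.dest.split('.', 1)
        sub = getattr(namespace, group, None)
        if sub is None:
            sub = argparse.Namespace()
            setattr(namespace, group, sub)
        setattr(sub, name, values)

parser = argparse.ArgumentParser(prog='larger')
parser.add_argument(
    '--first-foo-option1',
    dest='foo1.option1',
    action=StoreNested,
    default=argparse.SUPPRESS,  # keep the parser from setting a flat default
)

ns = parser.parse_args(['--first-foo-option1', 'a'])
print(ns.foo1.option1)
```

The trade-off is that an option which is not given on the command line leaves no attribute at all, so the consuming component has to supply its own defaults.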

IPython uses argparse, both for the main call, and internally for magic commands. It constructs many arguments from config files. Thus it is possible to set many values either with default configs, custom configs, or at the last moment via the commandline arguments.

hpaulj answered Oct 22 '22 09:10