
Handling argparse escaped character as option

The argparse library handles escape sequences (such as \t for tab and \n for newline) differently than I would like. An answer to this question gives a solution, but I would like to make it less visible to the user.

Given the program:

#!/usr/bin/env python3
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', default='\t')
args = parser.parse_args()
print(args)

You will receive this output:

bash$ parser.py -d \t
Namespace(delimiter='t')

bash$ parser.py -d \\t
Namespace(delimiter='\\t')

bash$ parser.py -d '\t'
Namespace(delimiter='\\t')

bash$ parser.py -d '\\t'
Namespace(delimiter='\\\\t')

bash$ parser.py -d "\t"
Namespace(delimiter='\\t')

bash$ parser.py -d "\\t"
Namespace(delimiter='\\t')

bash$ parser.py -d $'\t'
Namespace(delimiter='\t')

bash$ parser.py -d $'\\t'
Namespace(delimiter='\\t')

bash$ parser.py -d $"\t"
Namespace(delimiter='$\\t')

bash$ parser.py -d $"\\t"
Namespace(delimiter='$\\t')

I get the desired argument only with

parser.py -d $'\t'

but I would prefer the input to look something like

parser.py -d \t 

or less preferably

parser.py -d '\t'
parser.py -d "\t"

If I want to change this behavior, can I do it using the argparse library? If not, can I write the behavior on top of the existing argparse library? If not, is this just the way that bash passes arguments to argparse, and therefore out of my hands? If so, is this something that is usually documented for users, or is this behavior assumed to be normal?

dvinesett asked Dec 08 '15 00:12



3 Answers

Assuming that the question was partially about how to carry out the post-processing explained by @hpaulj and since I couldn't see an immediate solution for Python 3 in the links above, here is a quick solution:

import codecs

def unescaped_str(arg_str):
    return codecs.decode(str(arg_str), 'unicode_escape')

then in the parser:

parser.add_argument('-d', '--delimiter', type=unescaped_str, default='\t')

This will make your less desirable cases work:

parser.py -d '\t'
parser.py -d "\t"

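But not the desired unquoted \t: bash strips the backslash before the program starts, so the script only ever receives t. In any case, this solution can be dangerous because the input is not validated: every escape sequence Python understands is decoded, not just \t and \n.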
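One way to add the missing check is to translate only a small whitelist of sequences and reject everything else. The following is a minimal sketch, not part of the original answer: the names ESCAPES and checked_delimiter are hypothetical, and the rule that a delimiter must be a single character is an assumption made for illustration.

```python
import argparse

# Hypothetical whitelist: only these two-character sequences are translated.
ESCAPES = {
    r'\t': '\t',
    r'\n': '\n',
    r'\\': '\\',
}


def checked_delimiter(arg_str):
    """Translate a whitelisted escape sequence, or accept one literal character."""
    if arg_str in ESCAPES:
        return ESCAPES[arg_str]
    if len(arg_str) == 1:
        return arg_str
    # argparse reports ArgumentTypeError as a normal usage error.
    raise argparse.ArgumentTypeError(
        'expected one character or one of: ' + ', '.join(sorted(ESCAPES)))


parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', type=checked_delimiter, default='\t')

# Explicit lists stand in for what the shell passes after it handles quoting.
print(parser.parse_args(['-d', r'\t']))  # same as: parser.py -d '\t'
print(parser.parse_args(['-d', ',']))    # same as: parser.py -d ,
```

Because the function raises argparse.ArgumentTypeError, an unsupported value such as -d abc produces the usual argparse usage message instead of a traceback.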

dojuba answered Sep 19 '22 13:09


The string that you see in the namespace is exactly the string that appears in sys.argv, which was created by bash and the interpreter. The parser does not process or tweak this string; it just sets the value in the namespace. You can verify this by printing sys.argv before parsing.
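The pass-through can be checked without a shell by handing parse_args an explicit list in place of sys.argv[1:]. This is a small sketch under that assumption; the helper name parse_and_compare is hypothetical, and the lists mimic what bash would deliver for three of the invocations in the question.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', default='\t')


def parse_and_compare(argv):
    """Return (raw argument, parsed value) for a list shaped like sys.argv[1:]."""
    args = parser.parse_args(argv)
    return argv[1], args.delimiter


# Each list simulates sys.argv[1:] for one shell invocation.
simulated_calls = (
    ['-d', 't'],     # parser.py -d \t    (bash drops the backslash)
    ['-d', '\\t'],   # parser.py -d '\t'  (backslash and t, two characters)
    ['-d', '\t'],    # parser.py -d $'\t' (a real tab character)
)

for argv in simulated_calls:
    raw, parsed = parse_and_compare(argv)
    print(repr(raw), repr(parsed), raw == parsed)
```

In every case the parsed value equals the raw argument, which is consistent with the point that the differences in the question come from shell quoting rather than from argparse.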

If it is clear to you what the user wants, then I'd suggest modifying args.delimiter after parsing. The primary purpose of the parser is to figure out what the user wants. You, as the programmer, can interpret and apply that information in any way.

Once you've worked out a satisfactory post-parsing function, you could implement it as a type for this argument (like what int() and float() do for numerical strings). But focus on the post-parsing processing.

hpaulj answered Sep 19 '22 13:09


Here's a quick way to handle the quoted ('\t' and "\t") input cases correctly (although it only handles the tab case):

parser.add_argument('-d', '--delimiter', type=lambda d: '\t' if d == '\\t' else d)

First note the following: in Python, "\t" is the tab literal, and the escaped "\\t" is a two-character string (the first character is "\", and the second is "t"). You can check this with len("\t"), len("\\t"), which gives 1, 2.

When the user gives the quoted -d '\t' on the command line, Python will receive the string '\t' (literally the two characters, "backslash" and "t"). We want to replace this two-character string with a single tab character. The type argument takes a function that pre-processes the argument's string value. The lambda checks for the two-character string and replaces it with the tab character; any other input passes through unchanged.
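The behavior described above can be tried without a shell by passing an explicit argument list to parse_args. This is only an illustrative sketch: the lists stand in for sys.argv[1:], and the variable names quoted and other are made up for the example.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter',
                    type=lambda d: '\t' if d == '\\t' else d)

# '\\t' in Python source is the two-character string that a quoted '\t'
# on the command line delivers to the program.
quoted = parser.parse_args(['-d', '\\t']).delimiter
other = parser.parse_args(['-d', ';']).delimiter

print(len('\\t'), len(quoted))    # two characters in, one character out
print(repr(quoted), repr(other))  # the tab is substituted; ';' is untouched
```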

Mike Trotta answered Sep 21 '22 13:09