The argparse library handles escaped characters (like \t to tab and \n to newline) differently than I prefer. An answer to this question gives a solution but I would like to make it less visible to the user.
Given the program:
#!/usr/bin/env python3
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', default='\t')
args = parser.parse_args()
print(args)
You will receive this output:
bash$ parser.py -d \t
Namespace(delimiter='t')
bash$ parser.py -d \\t
Namespace(delimiter='\\t')
bash$ parser.py -d '\t'
Namespace(delimiter='\\t')
bash$ parser.py -d '\\t'
Namespace(delimiter='\\\\t')
bash$ parser.py -d "\t"
Namespace(delimiter='\\t')
bash$ parser.py -d "\\t"
Namespace(delimiter='\\t')
bash$ parser.py -d $'\t'
Namespace(delimiter='\t')
bash$ parser.py -d $'\\t'
Namespace(delimiter='\\t')
bash$ parser.py -d $"\t"
Namespace(delimiter='$\\t')
bash$ parser.py -d $"\\t"
Namespace(delimiter='$\\t')
I get the desired argument only with
parser.py -d $'\t'
but I would prefer the input to look something like
parser.py -d \t
or less preferably
parser.py -d '\t'
parser.py -d "\t"
If I want to change the behavior, is this something I can do using the argparse library? If not, is it possible for me to write the behavior on top of the existing argparse library? If not, is this just the way that bash passes arguments to argparse therefore out of my hands? If that is true, is this something that is usually documented to users or is this behavior assumed to be normal?
Assuming that the question was partially about how to carry out the post-processing explained by @hpaulj and since I couldn't see an immediate solution for Python 3 in the links above, here is a quick solution:
import codecs

def unescaped_str(arg_str):
    return codecs.decode(str(arg_str), 'unicode_escape')
then in the parser:
parser.add_argument('-d', '--delimiter', type=unescaped_str, default='\t')
This will make your less desirable cases work:
parser.py -d '\t'
parser.py -d "\t"
But not the desired unquoted \t, because bash strips the backslash before Python ever sees the argument. In any case, this solution can be dangerous as there is no check mechanism...
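To address the missing check, here is a minimal sketch of the same type function with some validation added. It assumes that rejecting undecodable input with argparse.ArgumentTypeError is the behavior you want; the error message wording is my own. Note also that the unicode_escape codec may garble non-ASCII input, so this is best suited to plain ASCII delimiters.

```python
import argparse
import codecs


def unescaped_str(arg_str):
    """Decode backslash escapes such as \\t; reject input that cannot be decoded."""
    try:
        return codecs.decode(arg_str, 'unicode_escape')
    except UnicodeDecodeError as exc:
        # argparse turns ArgumentTypeError into a normal usage error message.
        raise argparse.ArgumentTypeError(
            f'invalid escape sequence in {arg_str!r}: {exc}')


parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', type=unescaped_str, default='\t')

# Simulates `parser.py -d '\t'`: the program receives a backslash followed by t.
args = parser.parse_args(['-d', '\\t'])
print(repr(args.delimiter))
```

With this in place, a malformed value such as a truncated \x escape produces a regular argparse usage error instead of a traceback.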
The string that you see in the namespace is exactly the string that appears in sys.argv, which was created by bash and the interpreter. The parser does not process or tweak this string. It just sets the value in the namespace. You can verify this with print(sys.argv) before parsing.
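A small sketch of that check: it feeds a few argument lists to the parser directly (standing in for what bash would put in sys.argv) and confirms the value comes back unchanged. The helper name show is just for illustration.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', default='\t')


def show(argv):
    """Print the raw argument list next to the parsed namespace."""
    print('argv:  ', argv)
    args = parser.parse_args(argv)
    print('parsed:', args)
    return args


# What Python receives for `-d \t`, `-d '\t'`, and `-d $'\t'` respectively.
for argv in (['-d', 't'], ['-d', '\\t'], ['-d', '\t']):
    args = show(argv)
    # The parser stores the string exactly as it was received.
    assert args.delimiter == argv[1]
```

In a real run you would call print(sys.argv) before parser.parse_args() to see the same thing.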
If it is clear to you what the user wants, then I'd suggest modifying args.delimiter after parsing. The primary purpose of the parser is to figure out what the user wants. You, as the programmer, can interpret and apply that information in any way.
Once you've worked out a satisfactory post-parsing function, you could implement it as a type for this argument (like what int() and float() do for numerical strings). But focus on the post-parsing processing.
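As a rough sketch of what such post-parsing could look like, the lookup table below maps a few spellings to the characters they stand for. The table contents (including the 'tab' alias) and the function name normalize_delimiter are hypothetical choices, not anything argparse provides.

```python
import argparse

# Hypothetical table of spellings this script decides to accept.
KNOWN_DELIMITERS = {
    '\\t': '\t',   # backslash + t, as received from '\t' or "\t" in bash
    '\\n': '\n',   # backslash + n
    'tab': '\t',   # a name that needs no quoting at all
}


def normalize_delimiter(value):
    """Return the mapped character, or the value unchanged if it is not listed."""
    return KNOWN_DELIMITERS.get(value, value)


parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', default='\t')

args = parser.parse_args(['-d', '\\t'])
# Post-parsing step: reinterpret the raw string after argparse is done.
args.delimiter = normalize_delimiter(args.delimiter)
print(repr(args.delimiter))
```

Once the function behaves the way you like, passing it as type=normalize_delimiter in add_argument() gives the same result without the separate step.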
Here's a quick way to handle the quoted ('\t' and "\t") input cases correctly (although it only handles your specific tab case):
parser.add_argument('-d', '--delimiter', type=lambda d: '\t' if d == '\\t' else d)
First note the following: in Python, "\t" is the tab literal, and the escaped "\\t" is a two-character string (the first character is "\", and the second is "t"). You can check this with len("\t"), len("\\t"), which gives 1, 2.
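The same check written out as a short script, in case it helps to see the two strings side by side (the variable names are only for illustration):

```python
tab = "\t"            # one character: an actual tab
backslash_t = "\\t"   # two characters: a backslash, then the letter t

print(len(tab), len(backslash_t))    # 1 2
print(repr(tab), repr(backslash_t))  # '\t' '\\t'

# A raw string literal spells the two-character version without doubling.
print(backslash_t == r"\t")          # True
```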
When the user gives the quoted -d '\t' on the command line, Python will receive the string '\t' (literally the two characters, "backslash" and "t"). We want to replace this two-character string with a single "tab" character. The type argument takes a function as a way of pre-processing an argument. The lambda function checks for the two-character string and replaces it with the tab character.