I'm curious about what goes on behind the scenes when using argparse. I've checked here and here, as apparently Namespace presently only exists in the argparse library.
It's possible I'm using the wrong keywords to search SO/Google. It's also possible that I'm asking a senseless or obvious question, but here we go.
When capturing a string of input in Python via argparse as such:
>python palindrome.py 'Taco cat!?'
When running the code below, I would expect that by specifying parser.add_argument('string'... the resulting Namespace acts as a buffer for the single input string.
The next line where I assign the string to "args" must be the first time we actually parse the input, incurring a workload proportionate to the length of the input string. At this point, "args" actually contains a Namespace object, which cannot be parsed via for loop or otherwise (that I know of).
Finally, in order to parse the input using "for" or some other loop, I use the Namespace object to populate a string. I'm curious how many times this process incurs a compute time proportionate to the original string length?
Which of these steps copy by address or copy by value behind the scenes? It looks like the optimistic worst case would be 2x: once to create the Namespace object, then once again to assign its content to "arg_str".
#! /usr/bin/env python
import sys
import argparse
parser = argparse.ArgumentParser(description='Enter string to see if it\'s a palindrome.')
parser.add_argument('string', help="string to be tested for palindromedness..ishness")
args = parser.parse_args()
arg_str = args.string
# I can parse by using 'for i in arg_str:' but if I try to parse 'for i in args:'
# I get TypeError: 'Namespace' object is not iterable
Thanks for looking!!
Later, calling parse_args() will return an object with two attributes, integers and accumulate. The integers attribute will be a list of one or more ints, and the accumulate attribute will be either the sum() function, if --sum was specified at the command line, or the max() function if it was not.
After importing the library, argparse.ArgumentParser() initializes the parser so that you can start to add custom arguments. To add your arguments, use parser.add_argument().
argparse — parse the arguments. Using argparse is how you let the user of your program provide values for variables at runtime. It's a means of communication between the writer of a program and the user.
The operating system (or the shell) first parses the command line, passing the strings to the Python interpreter, where they are accessible to you as the sys.argv list.
python palindrome.py 'Taco cat!?'
becomes
['palindrome.py', 'Taco cat!?']
parser.parse_args() processes those strings, generally by just passing references around. When a basic argument is 'parsed', that string is 'stored' in the Namespace with setattr(namespace, dest, value), which in your example would be equivalent to setattr(namespace, 'string', sys.argv[1]).
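A minimal sketch of that claim, assuming CPython's argparse and the default type (no conversion function). The argv list here is a stand-in for sys.argv[1:], passed explicitly so the snippet runs without a real command line:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('string')

# Stand-in for sys.argv[1:]; parse_args() accepts an explicit list.
argv = ['Taco cat!?']
args = parser.parse_args(argv)

# Plain assignment binds another name to the same object.
arg_str = args.string

# Identity checks: True means the very same string object, not a copy.
same_as_argv = args.string is argv[0]
same_as_attr = arg_str is args.string
print(same_as_argv, same_as_attr)
```

If a type= converter such as int were supplied, the stored value would be whatever that converter returns, so the identity check against argv would no longer apply.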
There is nothing special about argparse.Namespace. It is a simple subclass of object. The arguments are simple object attributes. argparse uses setattr and getattr to access these, though users can normally use the dot format (args.string). It does not do any special string handling. That is entirely Python's responsibility.
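To illustrate, here is a small sketch that fills a Namespace by hand, roughly the way argparse does. The attribute name 'string' and the sample value are taken from the question; nothing else about argparse internals is assumed:

```python
import argparse

# Build a Namespace by hand and store an attribute with setattr.
ns = argparse.Namespace()
setattr(ns, 'string', 'Taco cat!?')

# Dot access and getattr read the same ordinary attribute.
via_dot = ns.string
via_getattr = getattr(ns, 'string')
print(via_dot == via_getattr)

# The attribute holds a normal str, so it iterates character by character.
chars = [c for c in ns.string]
print(len(chars))
```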
The Namespace is not an iterable, that is, it is not a list or tuple or anything like that. It is an object. The namespace can be converted to a dictionary with vars(args) (that's in the argparse documentation). So you could iterate over that dictionary using keys and items.
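A short sketch of iterating over the parsed arguments through vars(). The argument list is passed explicitly here as a stand-in for the real command line:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('string')
args = parser.parse_args(['Taco cat!?'])  # explicit list instead of sys.argv

# vars() exposes the Namespace's attributes as a dictionary.
for name, value in vars(args).items():
    print(name, repr(value))

# A plain dict copy, handy if you want to pass the arguments around.
as_dict = dict(vars(args))
```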
One other thing. Don't test your args.string with is. Use == or in [] to compare it to other strings. That is because a string that is created via sys.argv does not have the same id as one created via x = 'test'. To get an idea why, try:
argv = 'one two three'.split()
print(argv[0] == 'one')  # True
print(argv[0] is 'one')  # False in CPython (3.8+ also warns about "is" with a literal)
print(argv[0] in ['one', 'two', 'three'])  # True
x = 'one'
print(x is 'one')  # True in CPython
print(id(x))
print(id('one'))
print(id(argv[0]))
Where possible Python does keep unique copies of strings, but if the strings are generated in different ways, they will have different ids and will not satisfy the is test.
Python assignment never makes copies of things. Variables are names for objects; assigning an object to a variable gives the object a new name (taking the name from whatever had it before). In low-level terms, it amounts to a pointer copy and a few refcount operations.
No part of this code requires a copy to be made of the input string. If it did, though, it wouldn't matter. Command-line arguments can't (and shouldn't) get long enough for the time you might spend copying to be significant compared to the rest of your runtime.
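As a rough illustration of the point about assignment, the sketch below binds a second name to a string and passes it to a function, then checks identity. The helper first_char and the large sample string are hypothetical and only there for the demonstration:

```python
def first_char(s):
    # s is just another name for the caller's object; nothing is copied here.
    return s[0], id(s)

text = 'x' * 1_000_000   # hypothetical large input
alias = text             # binds a second name to the same object
ch, seen_id = first_char(text)

# Both checks compare object identity, not contents.
print(alias is text, seen_id == id(text))
```

Neither the assignment nor the call does work proportional to the string's length; only operations that build a new string (slicing, concatenation, and so on) would.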