
How does Python populate a string from argparse

I'm curious about what goes on behind the scenes when using argparse. I've checked here and here, as apparently Namespace presently only exists in the argparse library.

It's possible I'm using the wrong keywords to search SO/Google. It's also possible that I'm asking a senseless or obvious question, but here we go.

When capturing a string of input in Python via argparse as such:

>python palindrome.py 'Taco cat!?'

When running the code below, I would expect that by specifying parser.add_argument('string'... the resulting Namespace acts as a buffer for the single input string.

The next line where I assign the string to "args" must be the first time we actually parse the input, incurring a workload proportionate to the length of the input string. At this point, "args" actually contains a Namespace object, which cannot be parsed via for loop or otherwise (that I know of).

Finally, in order to parse the input using "for" or some other loop, I use the Namespace object to populate a string. I'm curious how many times this process incurs a compute time proportionate to the original string length?

Which of these steps copy by address or copy by value behind the scenes? It looks like the optimistic worst case would be 2x: once to create the Namespace object, then once again to assign its content to "arg_str".

#! /usr/bin/env python
import sys
import argparse

parser = argparse.ArgumentParser(description='Enter string to see if it\'s a palindrome.')
parser.add_argument('string', help="string to be tested for palindromedness..ishness")
args = parser.parse_args()

arg_str = args.string

# I can parse by using 'for i in arg_str:' but if I try to parse 'for i in args:'
# I get TypeError: 'Namespace' object is not iterable

Thanks for looking!!

hitjim asked Nov 19 '13 23:11



2 Answers

The operating system (or the shell) first parses the command line, passing the resulting strings to the Python interpreter, where they are accessible to you as the sys.argv list.

python palindrome.py 'Taco cat!?'

becomes

['palindrome.py', 'Taco cat!?']
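The splitting step can be approximated from inside Python. This is only a sketch: shlex.split follows POSIX shell quoting rules, which is an assumption about the shell in use, and the real work is done by the shell and the OS before Python starts.

```python
import shlex

# The command line as typed, kept as one string for illustration.
command_line = "python palindrome.py 'Taco cat!?'"

# shlex.split applies POSIX-style quoting rules, so the quoted
# argument stays together as a single token.
tokens = shlex.split(command_line)

# sys.argv starts at the script name, not at the interpreter name.
simulated_argv = tokens[1:]
print(simulated_argv)
```

The quotes are consumed during splitting, so the script never sees them.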

parser.parse_args() processes those strings, generally by just passing references around. When a basic argument is 'parsed', that string is 'stored' in the Namespace with setattr(namespace, dest, value), which in your example would be equivalent to setattr(namespace, 'string', sys.argv[1]).
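One way to check the "passing references around" claim is to compare object identity before and after parsing. This is a sketch that passes an explicit list to parse_args() in place of sys.argv[1:], and it assumes a plain positional argument with no type= conversion. The identity result reflects how CPython's argparse behaves and is not a documented guarantee.

```python
import argparse

# Stand-in for sys.argv[1:]. The string is built at runtime so it is
# a distinct object rather than a shared literal.
argv = [" ".join(["Taco", "cat!?"])]

parser = argparse.ArgumentParser()
parser.add_argument("string")
args = parser.parse_args(argv)

# Equal contents are expected either way.
print(args.string == argv[0])

# Identity is the interesting part: if this prints True, the Namespace
# holds the very same string object and no copy was made.
print(args.string is argv[0])
```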

There is nothing special about argparse.Namespace. It is a simple subclass of object. The arguments are plain object attributes. argparse uses setattr and getattr to access these, though users can normally use dot notation (args.string). It does not do any special string handling; that is entirely Python's responsibility.

The Namespace is not an iterable, that is, it is not a list or tuple or anything like that. It is an object. The namespace can be converted to a dictionary with vars(args) (that's in the argparse documentation). So you could iterate over that dictionary using keys and items.
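A minimal sketch of both kinds of iteration, again passing an explicit list to parse_args() in place of the real command line. The palindrome check at the end is just an illustration of looping over the string attribute, not the asker's actual code.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("string")
args = parser.parse_args(["Taco cat!?"])

# vars() exposes the Namespace attributes as a dictionary,
# which can be iterated like any other dict.
for name, value in vars(args).items():
    print(name, "=", value)

# The attribute itself is an ordinary str, so it iterates by character.
letters = [ch.lower() for ch in args.string if ch.isalpha()]
print(letters == letters[::-1])
```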

One other thing. Don't test your args.string with is. Use == or in [] to compare it to other strings. That is because a string that is created via sys.argv does not have the same id as one created via x = 'test'. To get an idea why, try:

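# Python 3 syntax; recent versions emit a SyntaxWarning for 'is' with a literal
argv = 'one two three'.split()
print(argv[0] == 'one')  # True: same contents
print(argv[0] is 'one')  # False on CPython: different objects
print(argv[0] in ['one', 'two', 'three'])  # True
x = 'one'
print(x is 'one')  # True on CPython: both names refer to the same literal
print(id(x))
print(id('one'))
print(id(argv[0]))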
Where possible, Python does keep unique copies of strings, but if the strings are generated in different ways, they will have different ids and will not satisfy the is test.

hpaulj answered Oct 01 '22 01:10


Python assignment never makes copies of things. Variables are names for objects; assigning an object to a variable gives the object a new name (taking the name from whatever had it before). In low-level terms, it amounts to a pointer copy and a few refcount operations.
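To illustrate, here is a small sketch using a hand-built Namespace in place of the one parse_args() would return. The variable names are made up for the example.

```python
import argparse

# Build the string at runtime so it is one specific object to track.
original = "".join(["Taco", " ", "cat!?"])

# Storing it on a Namespace and reading it back are both just
# name bindings; neither step copies the characters.
ns = argparse.Namespace(string=original)
arg_str = ns.string

print(arg_str is original)
print(id(arg_str) == id(original))
```

So in the question's terms, neither creating the Namespace nor the arg_str = args.string assignment costs time proportional to the string length.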

No part of this code requires a copy to be made of the input string. If it did, though, it wouldn't matter. Command-line arguments can't (and shouldn't) get long enough for the time you might spend copying to be significant compared to the rest of your runtime.

user2357112 supports Monica answered Oct 01 '22 01:10