
side effect gotchas in python/numpy? horror stories and narrow escapes wanted

I am considering moving from Matlab to Python/numpy for data analysis and numerical simulations. I have used Matlab (and SML-NJ) for years, and am very comfortable in the functional environment without side effects (barring I/O), but am a little reluctant about the side effects in Python. Can people share their favorite gotchas regarding side effects, and if possible, how they got around them? As an example, I was a bit surprised when I tried the following code in Python:

lofls = [[]] * 4    # an accident waiting to happen!
lofls[0].append(7)  # not what I was expecting...
print(lofls)        # gives [[7], [7], [7], [7]]
# instead, I should have done this (I think)
lofls = [[] for x in range(4)]
lofls[0].append(7)  # only appends to the first list
print(lofls)        # gives [[7], [], [], []]

thanks in advance

shabbychef asked Mar 09 '10 18:03


1 Answer

Confusing references to the same (mutable) object with references to separate objects is indeed a "gotcha" (suffered by all non-functional languages, ones which have mutable objects and, of course, references). A frequently seen bug in beginners' Python code is misusing a default value which is mutable, e.g.:

def addone(item, alist=[]):
  alist.append(item)
  return alist

This code may be correct if the purpose is to have addone keep its own state (and return the one growing list to successive callers), much as static data would work in C; it's not correct if the coder is wrongly assuming that a new empty list will be made at each call.
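A common way to get a new empty list on each call is to use None as the default and create the list inside the function. The sketch below is only an illustration: the name addone_fresh is hypothetical and not part of the original code.

```python
def addone(item, alist=[]):
    # The default list is created once, when the def statement runs,
    # and is then shared by every call that omits alist.
    alist.append(item)
    return alist


def addone_fresh(item, alist=None):
    # Hypothetical variant: None is the sentinel, so a new list is
    # created on each call that omits alist.
    if alist is None:
        alist = []
    alist.append(item)
    return alist


first = addone(1)
second = addone(2)
print(first is second)   # True: both names refer to the one default list
print(second)            # [1, 2]

fresh_first = addone_fresh(1)
fresh_second = addone_fresh(2)
print(fresh_first)       # [1]
print(fresh_second)      # [2]
```

Passing an explicit list still works the same way in both versions; only the behaviour of the omitted argument differs.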

Raw beginners used to functional languages can also be confused by the command-query separation design decision in Python's built-in containers: mutating methods that don't have anything in particular to return (i.e., the vast majority of mutating methods) return nothing (specifically, they return None) -- they're doing all their work "in-place". Bugs coming from misunderstanding this are easy to spot, e.g.

alist = alist.append(item)

is pretty much guaranteed to be a bug -- it appends an item to the list referred to by name alist, but then rebinds name alist to None (the return value of the append call).
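A short sketch of the same point, using only built-ins: the mutating methods append and sort return None, while the built-in function sorted leaves its argument alone and returns a new list. The variable names are illustrative only.

```python
alist = [3, 1, 2]

result = alist.append(0)   # mutates alist in place
print(result)              # None
print(alist)               # [3, 1, 2, 0]

ordered = sorted(alist)    # query: builds and returns a new sorted list
in_place = alist.sort()    # command: sorts alist itself, returns None
print(ordered)             # [0, 1, 2, 3]
print(in_place)            # None
```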

While the first issue I mentioned is about early binding that may mislead people who expect the binding to be late, there are issues that go the other way, where people expect early binding while the binding is, in fact, late. For example (with a hypothetical GUI framework...):

for i in range(10):
    Button(text="Button #%s" % i,
           click=lambda: say("I'm #%s!" % i))

This will show ten buttons saying "Button #0", "Button #1", etc., but, when clicked, each and every one of them will say it's #9 -- because the i within the lambda is late-bound (looked up through a lexical closure when the lambda is called, by which time the loop has finished). A fix is to take advantage of the fact that default values for arguments are early-bound (as I pointed out about the first issue!-) and change the last line to

           click=lambda i=i: say("I'm #%s!" % i))

Now lambda's i is an argument with a default value, not a free variable (looked up by lexical closure) any more, and so the code works as intended (there are other ways too, of course).
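The difference can be checked without any GUI. The sketch below assumes a stand-in say function that returns the string instead of displaying it (Button and say are hypothetical in the original anyway). functools.partial is shown as one possible "other way"; that choice is an illustration, not something the answer itself names.

```python
from functools import partial


def say(n):
    # Stand-in for the hypothetical GUI callback: returns the message.
    return "I'm #%s!" % n


late = []
early = []
bound = []
for i in range(3):
    late.append(lambda: say(i))        # i is looked up when the lambda is called
    early.append(lambda i=i: say(i))   # current i is stored as a default value
    bound.append(partial(say, i))      # partial stores the current i as an argument

late_results = [f() for f in late]
early_results = [f() for f in early]
bound_results = [f() for f in bound]

print(late_results)    # ["I'm #2!", "I'm #2!", "I'm #2!"]
print(early_results)   # ["I'm #0!", "I'm #1!", "I'm #2!"]
print(bound_results)   # ["I'm #0!", "I'm #1!", "I'm #2!"]
```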

Alex Martelli answered Oct 01 '22 20:10