Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why program functionally in Python?

At work we used to program our Python in a pretty standard OO way. Lately, a couple guys got on the functional bandwagon. And their code now contains lots more lambdas, maps and reduces. I understand that functional languages are good for concurrency but does programming Python functionally really help with concurrency? I am just trying to understand what I get if I start using more of Python's functional features.

like image 371
Paul Hildebrandt Avatar asked Dec 12 '09 04:12

Paul Hildebrandt


1 Answers

Edit: I've been taken to task in the comments (in part, it seems, by fanatics of FP in Python, but not exclusively) for not providing more explanations/examples, so, expanding the answer to supply some.

lambda, even more so map (and filter), and most especially reduce, are hardly ever the right tool for the job in Python, which is a strongly multi-paradigm language.

lambda main advantage (?) compared to the normal def statement is that it makes an anonymous function, while def gives the function a name -- and for that very dubious advantage you pay an enormous price (the function's body is limited to one expression, the resulting function object is not pickleable, the very lack of a name sometimes makes it much harder to understand a stack trace or otherwise debug a problem -- need I go on?!-).

Consider what's probably the single most idiotic idiom you sometimes see used in "Python" (Python with "scare quotes", because it's obviously not idiomatic Python -- it's a bad transliteration from idiomatic Scheme or the like, just like the more frequent overuse of OOP in Python is a bad transliteration from Java or the like):

inc = lambda x: x + 1 

by assigning the lambda to a name, this approach immediately throws away the above-mentioned "advantage" -- and doesn't lose any of the DISadvantages! For example, inc doesn't know its name -- inc.__name__ is the useless string '<lambda>' -- good luck understanding a stack trace with a few of these;-). The proper Python way to achieve the desired semantics in this simple case is, of course:

def inc(x): return x + 1 

Now inc.__name__ is the string 'inc', as it clearly should be, and the object is pickleable -- the semantics are otherwise identical (in this simple case where the desired functionality fits comfortably in a simple expression -- def also makes it trivially easy to refactor if you need to temporarily or permanently insert statements such as print or raise, of course).

lambda is (part of) an expression while def is (part of) a statement -- that's the one bit of syntax sugar that makes people use lambda sometimes. Many FP enthusiasts (just as many OOP and procedural fans) dislike Python's reasonably strong distinction between expressions and statements (part of a general stance towards Command-Query Separation). Me, I think that when you use a language you're best off using it "with the grain" -- the way it was designed to be used -- rather than fighting against it; so I program Python in a Pythonic way, Scheme in a Schematic (;-) way, Fortran in a Fortesque (?) way, and so on:-).

Moving on to reduce -- one comment claims that reduce is the best way to compute the product of a list. Oh, really? Let's see...:

$ python -mtimeit -s'L=range(12,52)' 'reduce(lambda x,y: x*y, L, 1)' 100000 loops, best of 3: 18.3 usec per loop $ python -mtimeit -s'L=range(12,52)' 'p=1' 'for x in L: p*=x' 100000 loops, best of 3: 10.5 usec per loop 

so the simple, elementary, trivial loop is about twice as fast (as well as more concise) than the "best way" to perform the task?-) I guess the advantages of speed and conciseness must therefore make the trivial loop the "bestest" way, right?-)

By further sacrificing compactness and readability...:

$ python -mtimeit -s'import operator; L=range(12,52)' 'reduce(operator.mul, L, 1)' 100000 loops, best of 3: 10.7 usec per loop 

...we can get almost back to the easily obtained performance of the simplest and most obvious, compact, and readable approach (the simple, elementary, trivial loop). This points out another problem with lambda, actually: performance! For sufficiently simple operations, such as multiplication, the overhead of a function call is quite significant compared to the actual operation being performed -- reduce (and map and filter) often forces you to insert such a function call where simple loops, list comprehensions, and generator expressions, allow the readability, compactness, and speed of in-line operations.

Perhaps even worse than the above-berated "assign a lambda to a name" anti-idiom is actually the following anti-idiom, e.g. to sort a list of strings by their lengths:

thelist.sort(key=lambda s: len(s)) 

instead of the obvious, readable, compact, speedier

thelist.sort(key=len) 

Here, the use of lambda is doing nothing but inserting a level of indirection -- with no good effect whatsoever, and plenty of bad ones.

The motivation for using lambda is often to allow the use of map and filter instead of a vastly preferable loop or list comprehension that would let you do plain, normal computations in line; you still pay that "level of indirection", of course. It's not Pythonic to have to wonder "should I use a listcomp or a map here": just always use listcomps, when both appear applicable and you don't know which one to choose, on the basis of "there should be one, and preferably only one, obvious way to do something". You'll often write listcomps that could not be sensibly translated to a map (nested loops, if clauses, etc), while there's no call to map that can't be sensibly rewritten as a listcomp.

Perfectly proper functional approaches in Python often include list comprehensions, generator expressions, itertools, higher-order functions, first-order functions in various guises, closures, generators (and occasionally other kinds of iterators).

itertools, as a commenter pointed out, does include imap and ifilter: the difference is that, like all of itertools, these are stream-based (like map and filter builtins in Python 3, but differently from those builtins in Python 2). itertools offers a set of building blocks that compose well with each other, and splendid performance: especially if you find yourself potentially dealing with very long (or even unbounded!-) sequences, you owe it to yourself to become familiar with itertools -- their whole chapter in the docs makes for good reading, and the recipes in particular are quite instructive.

Writing your own higher-order functions is often useful, especially when they're suitable for use as decorators (both function decorators, as explained in that part of the docs, and class decorators, introduced in Python 2.6). Do remember to use functools.wraps on your function decorators (to keep the metadata of the function getting wrapped)!

So, summarizing...: anything you can code with lambda, map, and filter, you can code (more often than not advantageously) with def (named functions) and listcomps -- and usually moving up one notch to generators, generator expressions, or itertools, is even better. reduce meets the legal definition of "attractive nuisance"...: it's hardly ever the right tool for the job (that's why it's not a built-in any more in Python 3, at long last!-).

like image 173
Alex Martelli Avatar answered Oct 05 '22 17:10

Alex Martelli