I recently started learning Python and I was rather surprised to find a recursion limit of 1000 (by default). If you set it high enough, around 30,000, it crashes with a segmentation fault just like C, although C seems to go quite a lot higher.
(The Python folks are quick to point out that you can always convert recursive functions to iterative ones and that they're always faster. That's 100% true. It's not really what my question is about though.)
I tried the same experiment in Perl and somewhere around 10 million recursions it consumed all of my 4 gigs of ram and I used ^C to stop trying. Clearly Perl doesn't use the C stack, but it does use a ridiculous amount of memory when it recurses -- not terribly shocking considering how much work it has to do to call functions.
I tried in Pike and was completely surprised to get 100,000,000 recursions in about 2 seconds. I have no idea how it did that, but I suspect it flattened the recursion to an iterative process -- it doesn't seem to consume any extra memory while it does it. [Note: Pike does flatten trivial cases, but segfaults on more complicated ones, or so I'm told.]
I used these otherwise useless functions:
C:
int f(int i, int l) { if (i < l) return f(i + 1, l); return i; }

Perl:
sub f { return f($_[0] + 1, $_[1]) if $_[0] < $_[1]; return $_[0] }

Python:
def f(i, l):
    if i < l:
        return f(i + 1, l)
    return i
I'm very curious how other languages (e.g., PHP, Ruby, Java, Lua, Ocaml, Haskell) handle recursion and why they handle it that way. Additionally, please note whether it makes a difference if the function is "tail-recursive" (see comment).
Python does support function recursion, meaning a defined function can call itself; this lets you loop through data to reach a result. The recursion depth limit in Python is 1000 by default, and you can change it with sys.setrecursionlimit(). Recursion is generally considered a poor fit in Python when the same algorithm has a more natural iterative implementation, or when the recursive use case can generate more than 1000 frames on the call stack.
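For example, a minimal sketch of checking and raising the limit for the Python f() from the question (raising it too far can still crash the interpreter with a segmentation fault, as noted above):

import sys

def f(i, l):
    if i < l:
        return f(i + 1, l)
    return i

print(sys.getrecursionlimit())   # 1000 by default in CPython
sys.setrecursionlimit(20000)     # raise the limit; setting it too high risks a real stack overflow
print(f(0, 5000))                # now succeeds instead of raising RecursionError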
"The Python folks are quick to point out that you can always convert recursive functions to iterative ones and that they're always faster"
This is true, but if it's really as easy as all that, why doesn't Python do it for me, so that my code can look as simple as possible? (I say this not to slam Python implementers, but because the answer to that question explains the situation.)
Recursion optimisations have been present in functional languages since, like, the 14th century or something. Haskell, CAML, Lisp implementations all typically convert at least tail recursive functions to iterations: you basically do this by spotting that it's possible, i.e. that the function can be rearranged such that no local variables other than the return value are used after the recursive call. One trick to make it possible if there's some work done to the recursed return value before return, is to introduce an additional "accumulator" parameter. In simple terms this means the work can effectively be done on the way "down" instead of on the way "up": search around for 'how to make a function tail-recursive' for details.
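To make the accumulator trick concrete, here's a sketch in Python (chosen only for illustration; CPython itself does not perform this optimisation). The naive factorial multiplies after the recursive call returns, i.e. on the way "up"; the accumulator version does the multiplication on the way "down", leaving a pure tail call:

def fact(n):
    if n == 0:
        return 1
    return n * fact(n - 1)            # work happens after the recursive call returns

def fact_acc(n, acc=1):
    if n == 0:
        return acc
    return fact_acc(n - 1, acc * n)   # tail call: nothing left to do afterwards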
The actual details of turning a tail recursive function into a loop is basically to jigger with your call convention such you can "perform the call" simply by setting up the parameters and jumping back to the start of the function, without bothering to save all that stuff in scope that you know you won't ever use. In assembler terms, you don't have to preserve caller-saves registers if data-flow analysis tells you they're unused beyond the call, and the same goes for anything on the stack: you don't have to move the stack pointer on a call if you don't mind "your" bit of stack being scribbled over by the next recursion/iteration.
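Written out by hand, that transformation is just a loop that rebinds the parameters and "jumps back to the start of the function". A sketch of the accumulator version above, again in Python:

def fact_iter(n):
    acc = 1
    while n != 0:                    # "jump back to the start of the function"
        n, acc = n - 1, acc * n      # "set up the parameters" for the next iteration
    return acc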
Contrary to how you paraphrased the Python folks, converting a general recursive function to an iteration is not trivial: for example if it's multiply-recursive then in a simple approach you'd still need a stack.
Memoization is a useful technique, though, for arbitrarily recursive functions, that you might like to look up if you're interested in the possible approaches. What it means is that each time a function is evaluated, you stick the result in a cache. To use this to optimize recursion, basically, if your recursive function counts "down", and you memoize it, then you can evaluate it iteratively by adding a loop that counts "up" calculating each value of the function in turn until you reach the target. This uses very little stack space provided that the memo cache is big enough to hold all the values you'll need: for instance if f(n) depends on f(n-1), f(n-2) and f(n-3) you only need space for 3 values in the cache: as you go up you can kick the ladder away. If f(n) depends on f(n-1) and f(n/2), you need lots of space in the cache, but still less than you'd use for stack in an unoptimised recursion.
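Here is a sketch of that counting-"up" idea, using a hypothetical function where f(n) depends on f(n-1), f(n-2) and f(n-3), so the cache only ever needs to hold three values:

def f(n):
    # bottom-up evaluation: compute f(0), f(1), ..., f(n) in turn,
    # keeping only the three most recent values ("kick the ladder away")
    cache = [0, 1, 1]                # f(0), f(1), f(2) -- base cases for this example
    if n < 3:
        return cache[n]
    for _ in range(3, n + 1):
        cache = [cache[1], cache[2], cache[0] + cache[1] + cache[2]]
    return cache[2]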
This is more of an implementation question than a language question. There's nothing stopping some (stoopid) C compiler implementor from also limiting their call stack to 1000. There are a lot of small processors out there that wouldn't have stack space for even that many.
"The Python folks are quick to point out that you can always convert recursive functions to iterative ones and that they're always faster. That's 100% true. It's not really what my question is about though."
Perhaps they do say that, but this isn't quite correct. Recursion can always be converted to iteration, but sometimes it also requires manual use of a stack. In those circumstances, I could see the recursive version being faster (assuming you are smart enough to make simple optimizations, like pulling unneeded declarations outside of the recursive routine). After all, the stack pushes surrounding procedure calls are a well-bounded problem that your compiler should know how to optimize very well. Manual stack operations, on the other hand, are not going to have specialized optimization code in your compiler, and are liable to have all sorts of sanity checks that will take up extra cycles.
It may be the case that the iterative/stack solution is always faster in Python. If so, that's a failing of Python, not of recursion.
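As a sketch of what "manual use of a stack" can look like (my own illustration, in Python, not taken from the post): the doubly-recursive naive Fibonacci rewritten as an iteration that manages its own list of pending calls:

def fib(n):
    # naive doubly-recursive version
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

def fib_stack(n):
    # same computation, with the recursion replaced by a manually managed stack;
    # each pending "call" is pushed as a plain integer argument
    total = 0
    stack = [n]
    while stack:
        m = stack.pop()
        if m < 2:
            total += m
        else:
            stack.append(m - 1)
            stack.append(m - 2)
    return total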