Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Clojure: recur vs. recursion via fn name

I'm just a beginner on Clojure, and I've been trying the 4clojure.com problems. There I stumbled upon a problem in an exercise where I am supposed to write a flatten implementation.

I basically understand the concept of tail call optimization, and how recur allows not consuming the stack, as opposed to "normal" recursion (I don't know if there's a proper term).

And that's why I don't get what is going on here:

(defn foo1 [x]
  (if (> x 0)
    (do (println x)
        (foo1 (dec x)))))

(defn foo2 [x]
  (if (> x 0)
    (do (println x)
        (recur (dec x)))))

As expected both foo1 and foo2 are the same functionally, but, given a parameter large enough (100000 in my case), I get a stack overflow™ on foo1 while foo2 completes normally.

Now, on to the flatten problem:

(defn flatten1 [ls]
  (mapcat
    #(if (coll? %)
      (flatten1 %)
      (list %))
    ls))

(defn flatten2 [ls]
  (mapcat
    #(if (coll? %)
      (recur %)
      (list %))
    ls))

Test case:

(flatten [1 [2] 3 [4 [5 6 [7] 8]]])
(flatten1 [1 [2] 3 [4 [5 6 [7] 8]]])
(flatten2 [1 [2] 3 [4 [5 6 [7] 8]]])

Expected result: '(1 2 3 4 5 6 7 8)

Well, flatten1 works ok (it's a small input anyway). But flatten2 just hangs indefinitely. Doesn't recur target the recursion point set at the defn? What's the difference (optimization aside) with recursing to the function by name?

like image 344
alepeino Avatar asked Oct 28 '25 08:10

alepeino


1 Answers

By modifying the program a bit you can see the problem:

(ns clj.core
  (:require [tupelo.core :as t] )
  (:gen-class))
(t/refer-tupelo)

(defn flatten1 [ls]
  (mapcat
    (fn [it]
      (println "f1: it=" it)
      (if (coll? it)
        (flatten1 it)
        (list it)))
    ls))

(defn flatten2 [ls]
  (mapcat
    (fn [it]
      (println "f2: it=" it)
      (if (coll? it)
        (recur it)
        (list it)))
    ls))

(defn -main
  [& args]
  (newline) (println "main - 1")
  (spyx (flatten  [1 [2] 3 [4 [5 6 [7] 8]]]))

  (newline) (println "main - 2")
  (spyx (flatten1 [1 [2] 3 [4 [5 6 [7] 8]]]))

  (newline) (println "main - 3")
  (flatten2 [1 [2] 3 [4 [5 6 [7] 8]]])

Running the code produces this output:

main - 1
(flatten [1 [2] 3 [4 [5 6 [7] 8]]]) => (1 2 3 4 5 6 7 8)

main - 2
f1: it= 1
f1: it= [2]
f1: it= 2
f1: it= 3
f1: it= [4 [5 6 [7] 8]]
f1: it= 4
f1: it= [5 6 [7] 8]
f1: it= 5
f1: it= 6
f1: it= [7]
f1: it= 7
f1: it= 8
(flatten1 [1 [2] 3 [4 [5 6 [7] 8]]]) => (1 2 3 4 5 6 7 8)

main - 3
f2: it= 1
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]

So you can see it gets stuck on the [2] item, the 2nd element of the input list.

The reason this fails is that the recur statement only jumps back to the innermost function, which is the anonymous form #(if ...) in your original problem, of the form (fn [it] ...) in the 2nd version.

Note that recur can only "jump" to the innermost fn/loop target. You cannot use recur to jump out of your inner anonymous function to reach flatten2. Since it only jumps to the inner function, the 1-elem collection [2] does not replace the ls value at the end of the mapcat call, and you therefore get the infinite loop.

The best advice for any programming is "keep it simple". Recursion is simpler than loop/recur for most problems.

On the JVM, each stack frame requires some memory (consult the docs about the -Xs switch to increase). If you use too many stack frames, you will eventually run out of memory (controlled by the -Xmx switch). You should usually be able to count on at least 1000 stack frames being available (you can test if you like for your machine & params). So as a rule of thumb, if your recursion depth is 1000 or less, don't worry about using loop/recur.

like image 135
Alan Thompson Avatar answered Oct 30 '25 04:10

Alan Thompson



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!