Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does Clojure's memoize force the evaluation of its arguments?

In Clojure if I memoize a function, name it f and call it on an argument a.

If a is a large lazy value, does memoize return a value based on matching the thunk as opposed to forcing the evaluation of a and matching on the result?

Where a thunk is the unevaluated part of the lazy sequence.

If this isn't the case is there a built-in way to get this behaviour?

Thanks!

like image 789
toofarsideways Avatar asked Feb 01 '12 00:02

toofarsideways


1 Answers

As stated by mikera, memoize doesn't handle infinite lazy seqs. I'm adding this answer to provide a short description of the implementation reasons for this (plus two ideas for identity-based memoization schemes, one simple, one more complex). (Edit: actually there is a simple general identity-based solution, see below.)

Why it doesn't work

memoize uses a hash map to store the mapping from arguments to values and hash maps use clojure.lang.Util/hasheq when determining if an object is one of their keys (except empty maps which just return false). Since hasheq's implementation for lazy seqs forces the entirety of the seq, asking any map if an infinite lazy seq is one of its keys will cause it to go into an infinite, memory-exhausting loop. Thus the same goes for memoize.

Strictly speaking, initially the map is an array map. (In Clojure, for reasons of efficiency, small maps are usually array maps; if enough items are assoc'd onto an array map, the return value becomes a hash map.) However non-empty array maps also fail to handle infinite lazy seqs due to a similar reason involving an equivalence-checking method.

A solution

System/identityHashCode returns whatever hashCode would return for a given objects if it used the default implementation (whether or not its hashCode is overridden).

(defprotocol PWrapped
  (-unwrap [this]))
PWrapped

(defn wrap [o]
  (reify
    Object
    (hashCode [_] (System/identityHashCode o))
    PWrapped
    (-unwrap [_] o)))

;;; adapted from clojure.core/memoize, (C) Rich Hickey, EPL-licenced
(defn memoize
  "Returns a memoized version of a referentially transparent function. The
  memoized version of the function keeps a cache of the mapping from arguments
  to results and, when calls with the same arguments are repeated often, has
  higher performance at the expense of higher memory use."
  {:added "1.0"
   :static true}
  [f]
  (let [mem (atom {})]
    (fn [& args]
      (if-let [e (find @mem (map wrap args))]
        (val e)
        (let [ret (apply f args)]
          (swap! mem assoc (map wrap args) ret)
          ret)))))

Now you can do this (which wouldn't work with the regular memoize):

user> (def r (lazy-cat (range 10000) (prn :foo) (range)))
#'user/r
user> (def f (memoize #(apply + (take 100 %))))
#'user/f
user> (f [1 2 3])
6
user> (f r)
4950

Original discussion of alternative implementations follows.

I don't think there is a built-in solution using pointer equality. To implement one as general as memoize, you'd have to implement a map structure using pointer-equality-based hashing (viz. System/identityHashCode). Or you could build a simple "most recently used" cache with clojure.lang.PersistentQueue.

like image 89
Michał Marczyk Avatar answered Sep 30 '22 03:09

Michał Marczyk