Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Holding onto the head of a sequence

Tags:

clojure

Reading a recent question I identified the function being discussed

(def fib-seq
    (lazy-cat [0 1] (map + (rest fib-seq) fib-seq)))

as holding onto the head of a sequence, but it occurred to me re-reading my answer that I had glossed over the details like they were obvious, so I went back to clarify and came up short. I know that fib-seq is a var and that as long as it's around it will hold all the elements in the sequence, but I'm not clear at all on the exact mechanics of how exactly the sequence is getting held onto. Any clarifications would be appreciated.

like image 511
Nathan Hughes Avatar asked Feb 06 '10 18:02

Nathan Hughes


1 Answers

Basically, regular GC rules apply... A sequence is just an object and holding onto its head means holding onto a reference to this object. This entails holding as much of the sequence as has already been realised in memory, because Clojure sequences are cached.

(A more detailed explanation follows -- see the fragment in bold for the gist of it... ;-))

'A sequence' in Clojure is an object which implements the ISeq interface. This provides methods which extract the first element of the sequence and the rest of the sequence (another object implementing ISeq). As a key detail, these take care not only of computing the correct object (first / rest of the sequence) and returning it to the caller, but also of caching the computed value in memory so any subsequent requests are faster -- and, more importantly, so that all requests for the same element of the sequence are guaranteed to return the same value, even if the ISeq is being generated on top of a mutable Java object which changes at some point. (Note this is absolutely crucial to the immutable semantics of Clojure sequences.)

A Var, on the other hand, is a container which holds, in rough terms, a 'pointer' to some Java object. If this happens to be an ISeq, then as long as the Var itself is not garbage collected (which it obviously won't ever be if it's a top level var in a currently existing namespace) or rebound, the ISeq itself will not be garbage collected and, in particular, the memory it uses for caching first / rest of sequence will not be released.

As for the other elements of the sequence: the 'rest' of the ISeq bound to the Var is an ISeq itself. Also, it gets cached by the first ISeq. Thus the first element of the 'rest' ISeq of an ISeq bound to a Var will never be garbage collected, because a reference to it is being held by the 'rest' ISeq of the ISeq bound to the Var and this ISeq won't be GC'd because it is being cached as the 'rest' component by the ISeq bound to the Var, which in turn won't be GC'd as long as it is bound to the Var, which in turn will normally never be GC'd because it's a top-level Var in a namespace.

Clearly the Var will be GC'd if it ceases to be held onto by its namespace (ns-unmap) or the namespace itself gets tossed (remove-ns). If it happens to have held an ISeq, that ISeq will be GC'd if and only if it isn't being held by some other bit of code -- the usual GC rules apply, of course. For bindings introduced with binding and local bindings introduced with let, all the above applies modulo lifetime issues of the bindings. (Which are not a subject of this Q.)

like image 199
Michał Marczyk Avatar answered Nov 14 '22 13:11

Michał Marczyk