I'm trying to understand clojure's lazy-seq
operator, and the concept of lazy evaluation in general. I know the basic idea behind the concept: Evaluation of an expression is delayed until the value is needed.
In general, this is achievable in two ways:
With lazy evaluation techniques, it is possible to construct infinite data structures that are evaluated as consumed. These infinite sequences utilizes lambdas, closures and recursion. In clojure, these infinite data structures are generated using lazy-seq
and cons
forms.
I want to understand how lazy-seq
does it's magic. I know it is actually a macro. Consider the following example.
(defn rep [n]
(lazy-seq (cons n (rep n))))
Here, the rep
function returns a lazily-evaluated sequence of type LazySeq
, which now can be transformed and consumed (thus evaluated) using the sequence API. This API provides functions take
, map
, filter
and reduce
.
In the expanded form, we can see how lambda is utilized to store the recipe for the cell without evaluating it immediately.
(defn rep [n]
(new clojure.lang.LazySeq (fn* [] (cons n (rep n)))))
LazySeq
?(reduce + (take 3 (map inc (rep 5))))
map
applied to the sequence,take
limit the sequence andreduce
evaluate the sequence?Also, how do these functions work with either a Vector
or a LazySeq
?
Also, is it possible to generate nested infinite data structures?: list containing lists, containing lists, containing lists... going infinitely wide and deep, evaluated as consumed with the sequence API?
And last question, is there any practical difference between this
(defn rep [n]
(lazy-seq (cons n (rep n))))
and this?
(defn rep [n]
(cons n (lazy-seq (rep n))))
That's a lot of questions!
If you take a look at LazySeq
's class source code you will notice that it implements ISeq
interface providing methods like first
, more
and next
.
Functions like map
, take
and filter
are built using lazy-seq
(they produce lazy sequences) and first
and rest
(which in turn uses more
) and that's how they can work with lazy seq as their input collection - by using first
and more
implementations of LazySeq
class.
(reduce + (take 3 (map inc (rep 5))))
The key is to look how LazySeq.first
works. It will invoke the wrapped function to obtain and memoize the result. In your case it will be the following code:
(cons n (rep n))
Thus it will be a cons cell with n
as its value and another LazySeq
instance (result of a recursive call to rep
) as its rest
part. It will become the realised value of this LazySeq
object and first
will return the value of the cached cons cell.
When you call more
on it, it will in the same way ensure that the value of the particular LazySeq
object is realised (or reused memoized value) and call more
on it (in this case more
on the cons cell containing another LazySeq
object).
Once you obtain another instance of LazySeq
object with more
the story repeats when you call first
on it.
map
and take
will create another lazy-seq
that will call first
and more
of the collection passed as their argument (just another lazy seq) so it will be similar story. The difference will be only in how the values passed to cons
are generated (e.g. calling f
to a value obtained by first
invoked on the LazySeq
value mapped over in map
instead of a raw value like n
in your rep
function).
With reduce
it's a bit simpler as it will use loop
with first
and more
to iterate over the input lazy seq and apply the reducing function to produce the final result.
As the actual implementation looks like for map
and take
I encourage you to check their source code - it's quite easy to follow.
As mentioned above, map
, take
and other functions work in terms of first
and rest
(reminder - rest
is implemented on top of more
). Thus we need to explain how first
and rest
/more
can work with different collection types: they check if the collection implements ISeq
(and then it implement those functions directly) or they try to create a seq
view of the collection and coll its implementation of first
and more
.
It's definitely possible but I am not sure what the exact data shape you would like to get. Do you mean getting a lazy seq which generates another sequence as it's value (instead of a single value like n
in your rep
) but returns it as a flat sequence?
(defn nested-cons [n]
(lazy-seq (cons (repeat n n) (nested-cons (inc n)))))
(take 3 (nested-cons 1))
;; => ((1) (2 2) (3 3 3))
that would rather return (1 2 2 3 3 3)
?
For such cases you might use concat
instead of cons
which creates a lazy sequence of two or more sequences:
(defn nested-concat [n]
(lazy-seq (concat (repeat n n) (nested-concat (inc n)))))
(take 6 (nested-concat 1))
;; => (1 2 2 3 3 3)
(defn rep [n]
(lazy-seq (cons n (rep n))))
and this?
(defn rep [n]
(cons n (lazy-seq (rep n))))
In this particular case not really. But in the case where a cons cell doesn't wrap a raw value but a result of a function call to calculate it, the latter form is not fully lazy. For example:
(defn calculate-sth [n]
(println "Calculating" n)
n)
(defn rep1 [n]
(lazy-seq (cons (calculate-sth n) (rep1 (inc n)))))
(defn rep2 [n]
(cons (calculate-sth n) (lazy-seq (rep2 (inc n)))))
(take 0 (rep1 1))
;; => ()
(take 0 (rep2 1))
;; Prints: Calculating 1
;; => ()
Thus the latter form will evaluate its first element even if you might not need it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With