Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why the constant() solution is more efficient than the easier one in "FP in Scala" 5.8?

I am looking at exercise 5.8 in book "FP in Scala" and the question is:

"Generalize ones slightly to the function constant, which returns an infinite Stream of a given value."

def constant[A](a: A): Stream[A]

My solution is:

def constant[A](a: A): Stream[A] =
  Stream.cons(a, constant(a))

while I refer to the standard solution, it was:

// This is more efficient than `cons(a, constant(a))` since it's just
// one object referencing itself.
def constant[A](a: A): Stream[A] = {
  lazy val tail: Stream[A] = Cons(() => a, () => tail) 
  tail
}

which says "more efficient", see here.

Can I know why is it more efficient? AFAIK, the cons constructor in Streams already marks the head and tail as lazy:

def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = { 
  lazy val head = hd 
  lazy val tail = tl 
  Cons(() => head, () => tail) 
}

Why do we still need to mark the tail as lazy again? I am not quite understand the point here.

like image 273
injoy Avatar asked Jan 04 '19 19:01

injoy


1 Answers

This is more a comment to @ElBaulP answer than an answer in its own rights but it is too big for a comment.

I think you missed the root of the optimization even though it is explicitly stated in the comment above. The fact that the val tail in the constant is lazy is an implementation detail or rather a trick to make the main source of the optimization possible. And the main source of the optimization is the fact that tail is self-referential.

For a moment let's get rid of all the lazy stuff. Assume that Cons is defined as

case class Cons[+A](h: A, t: () => Stream[A]) extends Stream[A]

and let's define constant as

def constant[A](a: A): Stream[A] = {
   val tailFunc: () => Stream[A] =  () => tail
   val tail: Stream[A] = Cons(a, tailFunc)
   tail
}

Yes, this is a syntactically invalid program because tailFunc uses tail defined only in the next line. But imagine Scala can compile that. I think now it is quite clear that such an implementation of constant will only create one instance of class Cons per call and that instance will be used no matter how long you try to iterate over the returned stream.

What defining tail as lazy allows is just to make the code logically equivalent to the above one compile without errors. From the implementation point of view it is similar to something like that:

class Lazy[A](var value:A)

def constant[A](a: A): Stream[A] = {
   val lazyTail: Lazy[Stream[A]] = new Lazy(null)
   // this tailFunc works because lazyTail is already fixed and can be safely 
   // captured although lazyTail.value will be changed
   val tailFunc: () => Stream[A] =  () => lazyTail.value 
   lazyTail.value = new Stream(a, tailFunc)
   lazyTail.value
}

This code does not exactly matches actual lazy implementation in many details because it is actually eager but I think it shows that you don't really need lazy to make the optimization work (but at the cost of using a var, which Scala compiler does anyway inside its real, more complicated lazy implementation).

In your naive implementation

def constant[A](a: A): Stream[A] =  Stream.cons(a, constant(a))

lazy evaluation of tail allows you to not fail immediately with stack overflow because of recursive calls of constant from itself. But still when you do constant(whatever).tail, the tail is being evaluated so the constant method is called once again and a new Cons object is created. And this will happen every time tail is called on new head.

To re-state it once again, lazy val tail is just a trick to allow tail to reference itself at creation and the really important part is the fact that tail references itself which allows to use only one object for iteration rather than one object per each next .tail call.

like image 81
SergGr Avatar answered Sep 18 '22 23:09

SergGr