I'm following the Functional Programming in Scala book. Here is the code snippet from Stream definition and functions to construct constant
using smart constructor and using unfold
:
sealed trait Stream[+A]
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, tl: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](h: => A, tl: => Stream[A]): Stream[A] = {
lazy val head = h
lazy val tail = tl
Cons(() => head, () => tail)
}
def empty[A]: Stream[A] = Empty
def constant[A](a: A): Stream[A] = cons(a, constant(a))
def unfold[A, S](z: S)(f: S => Option[(A, S)]): Stream[A] =
f(z).fold(empty[A])(x => cons(x._1, unfold(x._2)(f)))
def constantViaUnfold[A](a: A): Stream[A] = unfold(a)(x => Some((x, x)))
}
There is a footnote saying that:
Using
unfold
to defineconstant
means that we won't get sharing as in the recursive definition. The recursive definition consumes constant memory even if we keep a reference to it around while traversing it, while the unfold-based implementation does not. Preserving sharing isn’t something we usually rely on when programming with streams, since it’s extremely delicate and not tracked by the types. For instance, sharing is destroyed when calling evenxs.map(x => x).
What do the authors mean when saying that we are not getting sharing? What exactly is shared and why it's not preserved in unfold
version? Also the sentence about constant memory consumption is not clear as well.
Let's say you create a new list like that:
val n0 = Nil //List()
val n1 = 1 :: n0 //List(1)
val n2 = 2 :: n1 //List(2,1)
val n3 = 3 :: n2 //List(3,2,1)
If you build a list like that, it's easy to notice that n3 is really n2 with 3 prepended and n2 is just n1 with 2 prepended, etc. So reference to 1 is shared by n1, n2 and n3 and reference to 2 is shared by n2 and n2, etc. You could write it also like:
Cons(3, Cons(2, Cons(1, Nil)))
This is also the case in an example from FPinS when you create Stream recursively. You Stream is built from nested Cons and every substream shares elements with its parent. So when the next Stream is created it just wraps old one inside Cons with the new element. But since Stream is lazy all this building of Cons hierarchy will be only done if you materialize it, for example by calling toList.
Building stream like that gives you also constant memory consumption because creating the next stream from previous will only cost memory for a new element.
And why it is not a case with unfold
? Because it builds Stream "other way around". So it goes like:
Cons(x, next_iteration_of_unfold) //1st iteration
Cons(x, Cons(x, next_iteration_of_unfold)) //2nd iteration
Cons(x, Cons(x, Cons(x, next_iteration_of_unfold))) //3rd iteration, etc.
So as you can see nothing can be shared.
EDIT:
You can see how the materialized stream looks by calling take
on final implementation on the book and adding toString
to Cons
:
override def toString: String = s"Cons(${h()}, ${tl()})"
And then:
Stream.ones.take(3)
You will see:
Cons(1, Cons(1, Cons(1, Empty)))
I will provide the original book's (2014 year edition) quote to avoid confusion :
Using unfold to define constant and ones means that we don’t get sharing as in the recursive definition val ones: Stream[Int] = cons(1, ones). The recursive definition consumes constant memory even if we keep a reference to it around while traversing it, while the unfold-based implementation does not. Preserving sharing isn’t something we usually rely on when programming with streams, since it’s extremely delicate and not tracked by the types. For instance, sharing is destroyed when calling even xs.map(x => x).
As you can see the problem is a bit different here now. It's about reusing Cons
instance, so Stream
just reference to itself here.
So it was supposed by the author to have implementation like this:
//Imp1
def constant[A](a: A): Stream[A] = {
lazy val ones: Stream[A] = cons(a, ones)
ones
}
instead of just
//Imp2
def constant[A](a: A): Stream[A] = cons(a, constant(a))
As you can see Imp1
creates only one instance of Cons
for entire stream. Imp2
and unfold
version from your example generate new Cons
on every next call instead, so both methods generate streams that don't share Cons
instance.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With