These two expressions should mean the same thing:
Stream.from(1).filter(_ < 0).head
Stream.from(1).find(_ < 0)
The should loop around until they return Int.MinValue
. And that is exactly what the version with filter
does, but with find
an OutOfMemoryError
is produced. Looking at their implementations though, I can't figure out both versions don't produce an OutOfMemoryError
.
Here is the implementation of Stream.filter
:
override def filter(p: A => Boolean): Stream[A] = {
// optimization: drop leading prefix of elems for which f returns false
// var rest = this dropWhile (!p(_)) - forget DRY principle - GC can't collect otherwise
var rest = this
while (!rest.isEmpty && !p(rest.head)) rest = rest.tail
// private utility func to avoid `this` on stack (would be needed for the lazy arg)
if (rest.nonEmpty) Stream.filteredTail(rest, p)
else Stream.Empty
}
find
is inherited from LinearSeqOptimized
, with this definition:
override /*IterableLike*/
def find(p: A => Boolean): Option[A] = {
var these = this
while (!these.isEmpty) {
if (p(these.head)) return Some(these.head)
these = these.tail
}
None
}
They both have a while loop that discards elements of the Stream
that don't satisfy the predicate. Because this
should maintain a reference to the beginning of the Stream
all of these created elements should accumulate in memory until we run out of space. Unless I am really misunderstanding what is happening here, Stream.filter
is somehow eliminating this
from its stack frame before it enters the while loop. The comment in Stream.filter
on why dropWhile
isn't used looks like a hint, but I have no idea what it is referring to.
My next step would be learning how to disassemble and read JVM bytecode, but I'm really hoping someone knows what is happening here.
Return Value : Stream empty() returns an empty sequential stream. Note : An empty stream might be useful to avoid null pointer exceptions while callings methods with stream parameters.
If our Stream is ordered, it doesn't matter whether our data is being processed sequentially or in parallel; the implementation will maintain the encounter order of the Stream.
Java Stream is a data structure that is computed on-demand. It doesn't store data, it operates on the source data structure (collection and array) and produces pipelined data that we can use to implement internal iteration.
Stream filter() Methodfilter() is a intermediate Stream operation. It returns a Stream consisting of the elements of the given stream that match the given predicate. The filter() argument should be stateless predicate which is applied to each element in the stream to determine if it should be included or not.
It's a combination of HotSpot and the way Scala's traits are implemented.
If I turn HotSpot off with -Xint
, Stream.filter
will also die with an OutOfMemoryException
. In the generated bytecode itself, this
and the variables rest
and these
are stored in different memory locations, but because this
is only used to initialize these variables I believe HotSpot is smart enough to simply reuse the memory location for this
. This explains why Stream.filter
does not run out of memory.
HotSpot's optimization to Stream.filter
should also apply to LinearSeqOptimized.find
, however because of the way traits are implemented a reference to this
is preserved. When a method is implemented inside of a trait, Scala compiles that method into a static method. When a class inherits from that trait, Scala creates a small stub method that invokes the static method. So even though HotSpot optimizes the static method for LinearSeqOptimized.find
the stub method's stack frame still has a reference to this
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With