Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why doesn't Stream.filter run out of memory?

These two expressions should mean the same thing:

Stream.from(1).filter(_ < 0).head
Stream.from(1).find(_ < 0)

The should loop around until they return Int.MinValue. And that is exactly what the version with filter does, but with find an OutOfMemoryError is produced. Looking at their implementations though, I can't figure out both versions don't produce an OutOfMemoryError.

Here is the implementation of Stream.filter:

override def filter(p: A => Boolean): Stream[A] = {
  // optimization: drop leading prefix of elems for which f returns false
  // var rest = this dropWhile (!p(_)) - forget DRY principle - GC can't collect otherwise
  var rest = this
  while (!rest.isEmpty && !p(rest.head)) rest = rest.tail
  // private utility func to avoid `this` on stack (would be needed for the lazy arg)
  if (rest.nonEmpty) Stream.filteredTail(rest, p)
  else Stream.Empty
}

find is inherited from LinearSeqOptimized, with this definition:

override /*IterableLike*/
def find(p: A => Boolean): Option[A] = {
  var these = this
  while (!these.isEmpty) {
    if (p(these.head)) return Some(these.head)
    these = these.tail
  }
  None
}

They both have a while loop that discards elements of the Stream that don't satisfy the predicate. Because this should maintain a reference to the beginning of the Stream all of these created elements should accumulate in memory until we run out of space. Unless I am really misunderstanding what is happening here, Stream.filter is somehow eliminating this from its stack frame before it enters the while loop. The comment in Stream.filter on why dropWhile isn't used looks like a hint, but I have no idea what it is referring to.

My next step would be learning how to disassemble and read JVM bytecode, but I'm really hoping someone knows what is happening here.

like image 644
wingedsubmariner Avatar asked Apr 09 '14 15:04

wingedsubmariner


People also ask

What happens if stream is empty?

Return Value : Stream empty() returns an empty sequential stream. Note : An empty stream might be useful to avoid null pointer exceptions while callings methods with stream parameters.

Does stream filter keep order?

If our Stream is ordered, it doesn't matter whether our data is being processed sequentially or in parallel; the implementation will maintain the encounter order of the Stream.

Does stream save memory Java?

Java Stream is a data structure that is computed on-demand. It doesn't store data, it operates on the source data structure (collection and array) and produces pipelined data that we can use to implement internal iteration.

Does stream filter operate?

Stream filter() Methodfilter() is a intermediate Stream operation. It returns a Stream consisting of the elements of the given stream that match the given predicate. The filter() argument should be stateless predicate which is applied to each element in the stream to determine if it should be included or not.


1 Answers

It's a combination of HotSpot and the way Scala's traits are implemented.

If I turn HotSpot off with -Xint, Stream.filter will also die with an OutOfMemoryException. In the generated bytecode itself, this and the variables rest and these are stored in different memory locations, but because this is only used to initialize these variables I believe HotSpot is smart enough to simply reuse the memory location for this. This explains why Stream.filter does not run out of memory.

HotSpot's optimization to Stream.filter should also apply to LinearSeqOptimized.find, however because of the way traits are implemented a reference to this is preserved. When a method is implemented inside of a trait, Scala compiles that method into a static method. When a class inherits from that trait, Scala creates a small stub method that invokes the static method. So even though HotSpot optimizes the static method for LinearSeqOptimized.find the stub method's stack frame still has a reference to this.

like image 68
wingedsubmariner Avatar answered Sep 18 '22 23:09

wingedsubmariner