When I write such code:
Stream<Integer> xs = Arrays.asList(1, 3, 5, 6, 7, 10).stream();
xs.map(x -> x * x).filter (x -> x > 15).forEach(System.out::println);
Java8 streams are divided into two sections; intermediate vs terminal operations, where the -AFAIK - actual action (under-the-hood iterations) is done in the terminal ops, while each intermediate ops appends its own -let me name it- Apply inner classes.
By this, there will be only one iteration over the list.
Sample code from JDK8:
@Override
@SuppressWarnings("unchecked")
public final <R> Stream<R> map(Function<? super P_OUT, ? extends R> mapper) {
Objects.requireNonNull(mapper);
return new StatelessOp<P_OUT, R>(this, StreamShape.REFERENCE,
StreamOpFlag.NOT_SORTED | StreamOpFlag.NOT_DISTINCT) {
@Override
Sink<P_OUT> opWrapSink(int flags, Sink<R> sink) {
return new Sink.ChainedReference<P_OUT, R>(sink) {
@Override
public void accept(P_OUT u) {
downstream.accept(mapper.apply(u));
}
};
}
};
}
When I write such code:
val xs = List(1, 3, 5, 6, 7, 10)
xs map (x => x * x) filter (x => x > 15) foreach (println)
I have been a while reading about it, but I never hear about such terms explicitly, moreover, the SDK implementation loops (either using recursion or regular loops) on each operation:
final override def map[B, That](f: A => B)(implicit bf: CanBuildFrom[List[A], B, That]): That = {
if (bf eq List.ReusableCBF) {
if (this eq Nil) Nil.asInstanceOf[That] else {
val h = new ::[B](f(head), Nil)
var t: ::[B] = h
var rest = tail
while (rest ne Nil) {
val nx = new ::(f(rest.head), Nil)
t.tl = nx
t = nx
rest = rest.tail
}
h.asInstanceOf[That]
}
}
else super.map(f)
}
Can we consider that Java implementation on the same thing would be much faster. (O(n) in Java
vs O(multiples of n) in Scala
)
Intermediate operations process the current stream of data (if any) and then return a new stream. Terminal operations as the name suggests are the last in the pipeline of operations performed on a stream. To know more about Streams API, look at the article Top 10 Stream API interview questions with answers.
A terminal operation in Java is a method applied to a stream as the final step. They only return another stream. They return final result.
1) The main difference between intermediate and terminal operations is that intermediate operations return a stream as a result and terminal operations return non-stream values like primitive or object or collection or may not return anything.
The Java Stream peek() method is a non-terminal operation that takes a Consumer ( java. util. function.
Java 8 streams are less-feature-complete cousins of Scala iterators save for their ability to compute in parallel.
If you don't need parallel computation (and most times the overhead is not worth it--only for big expensive jobs do you want it), then you can get the same sort of processing with .iterator
in Scala (and then to[Vector]
or whatever you want at the end).
Java 8 streams are manually specialized (Scala Iterator
is not), so there are use cases where they are way faster, but it's not because of the constant factor due to recreating collections along the way--at least, not if you throw a .iterator
in there. (Without .iterator
, Scala collections evaluate eagerly by default; Java collections do not have that option.)
The Scala equivalent for the Java 8 code you wrote is the following:
val xsi = Array(1, 3, 5, 6, 7, 10).iterator
xsi.map(x => x*x).filter(_ > 15).foreach(println)
There is no difference in the number of collections created here with Scala vs. Java.
It would probably be a good idea to adopt the very clear "terminal operation" language for Scala's Iterator documentation. The Java 8 stream docs are great in that they make it exquisitely clear when you're building up the description of the work and when you're finally doing it.
Scala also provides a Stream
class that memoizes old work (so you don't have to compute it a second time if you're reusing it), and various views
so you don't have to recreate the processing chain each time you want to use it. For example, with your squaring, you could
val xsv = Array(1, 3, 5, 6, 7, 10).view
val xsq = xsv.map(x => x*x)
xsq.filter(_ > 15).foreach(println)
xsq.filter(_ < 5).foreach(println)
whereas with Java 8 streams xsq
would be exhausted after the first terminal operation.
So Scala actually does everything (save parallelism) that Java 8 streams do, and quite a bit more besides, and has for a long time.
Scala also has parallelizing collections, but the Java 8 implementation is sufficiently superior in performance that at this point I'd recommend using them first. And again, if manual specialization is your thing, Java 8 streams have it for Int, Double, and Long, and that's a big performance win. (Note: your example, using asList
, is not manually specialized.)
But if you just want to queue up operations and not have the overhead of building intermediate collections, Scala does it. You just have to ask.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With