Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does Scala has intermediate/terminal ops as Java8 has?

In Java8

When I write such code:

Stream<Integer> xs = Arrays.asList(1, 3, 5, 6, 7, 10).stream();
xs.map(x -> x * x).filter (x -> x > 15).forEach(System.out::println);

Java8 streams are divided into two sections; intermediate vs terminal operations, where the -AFAIK - actual action (under-the-hood iterations) is done in the terminal ops, while each intermediate ops appends its own -let me name it- Apply inner classes.

By this, there will be only one iteration over the list.

Sample code from JDK8:

@Override
@SuppressWarnings("unchecked")
public final <R> Stream<R> map(Function<? super P_OUT, ? extends R> mapper) {
    Objects.requireNonNull(mapper);
    return new StatelessOp<P_OUT, R>(this, StreamShape.REFERENCE,
                                 StreamOpFlag.NOT_SORTED | StreamOpFlag.NOT_DISTINCT) {
        @Override
        Sink<P_OUT> opWrapSink(int flags, Sink<R> sink) {
            return new Sink.ChainedReference<P_OUT, R>(sink) {
                @Override
                public void accept(P_OUT u) {
                    downstream.accept(mapper.apply(u));
                }
            };
        }
    };
}

In Scala

When I write such code:

val xs = List(1, 3, 5, 6, 7, 10) 
xs map (x => x * x) filter (x => x > 15) foreach (println)

I have been a while reading about it, but I never hear about such terms explicitly, moreover, the SDK implementation loops (either using recursion or regular loops) on each operation:

final override def map[B, That](f: A => B)(implicit bf: CanBuildFrom[List[A], B, That]): That = {
if (bf eq List.ReusableCBF) {
  if (this eq Nil) Nil.asInstanceOf[That] else {
    val h = new ::[B](f(head), Nil)
    var t: ::[B] = h
    var rest = tail
    while (rest ne Nil) {
      val nx = new ::(f(rest.head), Nil)
      t.tl = nx
      t = nx
      rest = rest.tail
    }
    h.asInstanceOf[That]
  }
}
else super.map(f)
}

My questions is:

Can we consider that Java implementation on the same thing would be much faster. (O(n) in Java vs O(multiples of n) in Scala)

like image 492
Muhammad Hewedy Avatar asked Jul 27 '15 23:07

Muhammad Hewedy


People also ask

What are intermediate and terminal operations in Java 8?

Intermediate operations process the current stream of data (if any) and then return a new stream. Terminal operations as the name suggests are the last in the pipeline of operations performed on a stream. To know more about Streams API, look at the article Top 10 Stream API interview questions with answers.

What are terminal operators in Java 8?

A terminal operation in Java is a method applied to a stream as the final step. They only return another stream. They return final result.

What is intermediate and terminal methods?

1) The main difference between intermediate and terminal operations is that intermediate operations return a stream as a result and terminal operations return non-stream values like primitive or object or collection or may not return anything.

Which is not terminal operations in Java 8?

The Java Stream peek() method is a non-terminal operation that takes a Consumer ( java. util. function.


1 Answers

Java 8 streams are less-feature-complete cousins of Scala iterators save for their ability to compute in parallel.

If you don't need parallel computation (and most times the overhead is not worth it--only for big expensive jobs do you want it), then you can get the same sort of processing with .iterator in Scala (and then to[Vector] or whatever you want at the end).

Java 8 streams are manually specialized (Scala Iterator is not), so there are use cases where they are way faster, but it's not because of the constant factor due to recreating collections along the way--at least, not if you throw a .iterator in there. (Without .iterator, Scala collections evaluate eagerly by default; Java collections do not have that option.)

The Scala equivalent for the Java 8 code you wrote is the following:

val xsi = Array(1, 3, 5, 6, 7, 10).iterator
xsi.map(x => x*x).filter(_ > 15).foreach(println)

There is no difference in the number of collections created here with Scala vs. Java.

It would probably be a good idea to adopt the very clear "terminal operation" language for Scala's Iterator documentation. The Java 8 stream docs are great in that they make it exquisitely clear when you're building up the description of the work and when you're finally doing it.

Scala also provides a Stream class that memoizes old work (so you don't have to compute it a second time if you're reusing it), and various views so you don't have to recreate the processing chain each time you want to use it. For example, with your squaring, you could

val xsv = Array(1, 3, 5, 6, 7, 10).view
val xsq = xsv.map(x => x*x)
xsq.filter(_ > 15).foreach(println)
xsq.filter(_ < 5).foreach(println)

whereas with Java 8 streams xsq would be exhausted after the first terminal operation.

So Scala actually does everything (save parallelism) that Java 8 streams do, and quite a bit more besides, and has for a long time.

Scala also has parallelizing collections, but the Java 8 implementation is sufficiently superior in performance that at this point I'd recommend using them first. And again, if manual specialization is your thing, Java 8 streams have it for Int, Double, and Long, and that's a big performance win. (Note: your example, using asList, is not manually specialized.)

But if you just want to queue up operations and not have the overhead of building intermediate collections, Scala does it. You just have to ask.

like image 59
Rex Kerr Avatar answered Sep 21 '22 12:09

Rex Kerr