How are lazy streams implemented in Java 8?

I am reading about Java 8, specifically the Streams API, and I would like to know how streams can be lazy.

As I understand it, streams were simply added as a library, with no changes made to the language to support laziness. I would also be shocked if somebody told me it is achieved through reflection.

Asked by Learner, May 14 '15 20:05


2 Answers

Why would you need reflection to get laziness? For example, consider this class:

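import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

class LazySeq<T> {

    private final List<T> list;
    // Stays null until filter() is called
    private Predicate<? super T> predicate;

    public LazySeq(List<T> input) {
        this.list = new ArrayList<>(input);
    }

    // Here you just store the predicate; no filtering is performed yet.
    // You could also return a new LazySeq with a new state.
    // Note: a second call to filter replaces the stored predicate.
    public LazySeq<T> filter(Predicate<? super T> predicate) {
        this.predicate = predicate;
        return this;
    }

    // Terminal operation: the stored predicate is applied only now
    public void forEach(Consumer<? super T> consumer) {
        if (predicate == null) {
            list.forEach(consumer);
        } else {
            for (T elem : list) {
                if (predicate.test(elem)) {
                    consumer.accept(elem);
                }
            }
        }
    }
}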

When you call filter on the lazy sequence, the filtering does not happen immediately. For example:

LazySeq<Integer> lazySeq = new LazySeq<>(Arrays.asList(1, 2, 3, 4));
lazySeq = lazySeq.filter(i -> i%2 == 0);

If you inspect the contents of the sequence after calling filter, you'll see that it is still 1, 2, 3, 4. However, when you call a terminal operation such as forEach, the filter is applied to each element before it reaches the consumer. So for example:

lazySeq.filter(i -> i%2 == 0).forEach(System.out::println);

will print 2 and 4.

Streams work on the same principle. Starting from a source, you chain operations that have certain properties. These operations are either intermediate, which return a lazy stream (such as filter or map), or terminal (such as forEach). Some terminal operations are short-circuiting (such as findFirst), so the pipeline may not traverse every element of the source (think of an early return in a for loop that looks up the index of a value in an array).
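To make the short-circuiting behaviour concrete, here is a minimal sketch using the real Stream API. The class name ShortCircuitDemo and the visited list are only illustrative; peek is used here purely to record which elements were pulled from the source of a sequential stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ShortCircuitDemo {

    // Records every element pulled from the source (for illustration only).
    static final List<Integer> visited = new ArrayList<>();

    static Optional<Integer> firstEven(List<Integer> input) {
        visited.clear();
        return input.stream()
                .peek(visited::add)        // note each element as it enters the pipeline
                .filter(i -> i % 2 == 0)   // intermediate: lazy
                .findFirst();              // terminal and short-circuiting
    }

    public static void main(String[] args) {
        Optional<Integer> result = firstEven(Arrays.asList(1, 2, 3, 4));
        System.out.println(result.get());  // prints 2
        System.out.println(visited);       // prints [1, 2]; 3 and 4 were never visited
    }
}
```

Only 1 and 2 are pulled from the list: once findFirst has its answer, the remaining elements are left untouched.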

When you call a terminal operation, this chain of operations starts to execute, so that at the end you get the expected result.
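One way to observe this is to log from inside the lambdas. The sketch below (LazyPipelineDemo and its log list are made-up names for this illustration) builds a pipeline on a sequential stream, checks that nothing has run, and only then calls a terminal operation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineDemo {

    // Side-effect log, used only to see when the lambdas actually run.
    static final List<String> log = new ArrayList<>();

    // Builds the pipeline but does not call any terminal operation.
    static Stream<Integer> buildPipeline(List<Integer> input) {
        return input.stream()
                .filter(i -> { log.add("filter " + i); return i % 2 == 0; })
                .map(i -> { log.add("map " + i); return i * 10; });
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = buildPipeline(Arrays.asList(1, 2, 3, 4));
        System.out.println(log.isEmpty());  // prints true: no lambda has run yet

        List<Integer> result = pipeline.collect(Collectors.toList());
        System.out.println(result);         // prints [20, 40]
        System.out.println(log);
        // prints [filter 1, filter 2, map 2, filter 3, filter 4, map 4]
    }
}
```

The log also shows that each element travels through the whole chain before the next one is pulled, rather than the filter running over the entire list first.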

Laziness can be achieved by storing a new state on the pipeline each time an intermediate operation is applied; when you call a terminal operation, each element of the data is passed through all the stored states one by one.

The Stream API is not actually implemented this way (it is a bit more complex), but the principle is the same.
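As a rough sketch of that idea, the hypothetical Pipeline class below returns a new immutable object for every intermediate operation. Each object stores a function that wraps the downstream consumer, and nothing touches the data until forEach is called. All names here are invented for illustration, and this is not how java.util.stream is written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

class PipelineDemo {

    // Hypothetical lazy pipeline: S is the source element type, T the current one.
    static final class Pipeline<S, T> {
        private final List<S> source;
        // The stored state: turns a consumer of T into a consumer of source elements.
        private final Function<Consumer<? super T>, Consumer<S>> wrap;

        private Pipeline(List<S> source, Function<Consumer<? super T>, Consumer<S>> wrap) {
            this.source = source;
            this.wrap = wrap;
        }

        static <S> Pipeline<S, S> of(List<S> source) {
            return new Pipeline<S, S>(new ArrayList<>(source),
                    downstream -> s -> downstream.accept(s));
        }

        // Intermediate: only records the predicate in a new stage.
        Pipeline<S, T> filter(Predicate<? super T> predicate) {
            return new Pipeline<S, T>(source, downstream ->
                    wrap.apply(t -> { if (predicate.test(t)) downstream.accept(t); }));
        }

        // Intermediate: only records the mapping function in a new stage.
        <R> Pipeline<S, R> map(Function<? super T, ? extends R> mapper) {
            return new Pipeline<S, R>(source, downstream ->
                    wrap.apply(t -> downstream.accept(mapper.apply(t))));
        }

        // Terminal: fuses all stages into one consumer and traverses the data once.
        void forEach(Consumer<? super T> action) {
            Consumer<S> fused = wrap.apply(action);
            for (S elem : source) {
                fused.accept(elem);
            }
        }
    }

    static List<String> run(List<Integer> input) {
        List<String> out = new ArrayList<>();
        Pipeline.of(input)
                .filter(i -> i % 2 == 0)
                .map(i -> "even:" + i)
                .forEach(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(1, 2, 3, 4)));  // prints [even:2, even:4]
    }
}
```

Unlike a version that mutates a single stored predicate, returning a new stage each time lets any number of filter and map calls be chained without one overwriting another.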

Answered by Alexis C., Sep 19 '22 21:09


No reflection or proxies. Both come with a performance cost, so they are best avoided unless there is no alternative, and performance is a top priority in Java.

What makes laziness possible is the functional style of doing things. Basically, a stream consists of a source (e.g. a List), a number of intermediate operations (e.g. filter, map) and a terminal operation (e.g. count, sum). Intermediate steps execute lazily because you pass functions (lambdas) that get chained into the pipeline and are executed only at the terminal step.
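The three parts can be seen in a short sketch like the one below (PipelineShape and its method names are made up for the example). Note that sum is defined on the primitive streams such as IntStream, so the second method goes through mapToInt first.

```java
import java.util.Arrays;
import java.util.List;

class PipelineShape {

    static long countLongWords(List<String> words) {
        return words.stream()                  // source
                .filter(w -> w.length() > 3)   // intermediate operation (lazy)
                .count();                      // terminal operation (triggers execution)
    }

    static int sumOfSquares(List<Integer> numbers) {
        return numbers.stream()                // source
                .mapToInt(n -> n * n)          // intermediate operation (lazy)
                .sum();                        // terminal operation on IntStream
    }

    public static void main(String[] args) {
        System.out.println(countLongWords(Arrays.asList("a", "lazy", "stream", "api")));  // prints 2
        System.out.println(sumOfSquares(Arrays.asList(1, 2, 3)));                         // prints 14
    }
}
```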

Example: filter(Predicate<? super T>)

Here, filter expects a function that tells whether an element of the stream meets some criterion.
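Because the criterion is just a function object, it can be stored in a variable, tested on its own, or combined before being handed to filter. A small sketch, with PredicateDemo, IS_EVEN and keep as illustrative names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateDemo {

    // A predicate is an ordinary value; defining it runs nothing.
    static final Predicate<Integer> IS_EVEN = i -> i % 2 == 0;

    // Passes the given criterion to Stream.filter and collects what survives.
    static List<Integer> keep(List<Integer> input, Predicate<? super Integer> criterion) {
        return input.stream()
                .filter(criterion)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
        System.out.println(IS_EVEN.test(4));                 // prints true
        System.out.println(keep(numbers, IS_EVEN));          // prints [2, 4]
        System.out.println(keep(numbers, IS_EVEN.negate())); // prints [1, 3]
    }
}
```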

Several features introduced in Java 7 have been used to make this efficient, for example invokedynamic for executing lambdas (rather than proxies or anonymous inner classes) and the Fork/Join framework for parallel execution.

If you are interested in Java 8 internals, watch the talk on this subject given by Brian Goetz, an expert in the field; it is on YouTube.

Answered by Sleiman Jneidi, Sep 21 '22 21:09