Are there any direct or indirect performance benefits of java 8 sequential streams?

Tags:

While going through articles of sequential streams the question came in my mind that are there any performance benefits of using sequential streams over traditional for loops or streams are just sequential syntactic sugar with an additional performance overhead?

Consider Below Example where I can not see any performance benefits of using sequential streams:

Stream.of("d2", "a2", "b1", "b3", "c")
    .filter(s -> {
        System.out.println("filter: " + s);
        return s.startsWith("a");
})
    .forEach(s -> System.out.println("forEach: " + s));

Using classic java:

String[] strings = {"d2", "a2", "b1", "b3", "c"};
        for (String s : strings)
        {
            System.out.println("Before filtering: " + s);
            if (s.startsWith("a"))
            {
                System.out.println("After Filtering: " + s);
            }
        }

Point Here is in streams processing of a2 starts only after all the operations on d2 is complete(Earlier I thought while d2 is being processed by foreach ,filter would have strated operating on a2 but that is not the case as per this article : https://winterbe.com/posts/2014/07/31/java8-stream-tutorial-examples/), same is the case with classic java, so what should be the motivation of using streams beyond "expressive" and "elegant" coding style?I know there are performance overheads for compiler while handling streams, does anyone know/have experienced about any performance benefits while using sequential streams?

977

asked Jan 17 '19 14:01

Hitesh

1 Answers

First of all, letting special cases, like omitting a redundant sorted operation or returning the known size on count(), aside, the time complexity of an operation usually doesn’t change, so all differences in execution timing are usually about a constant offset or a (rather small) factor, not fundamental changes.

You can always write a manual loop doing basically the same as the Stream implementation does internally. So, internal optimizations, as mentioned by this answer could always get dismissed with “but I could do the same in my loop”.

But… when we compare “the Stream” with “a loop”, is it really reasonable to assume that all manual loops are written in the most efficient manner for the particular use case? A particular Stream implementation will apply its optimizations to all use cases where applicable, regardless of the experience level of the calling code’s author. I’ve already seen loops missing the opportunity to short-circuit or performing redundant operations not needed for a particular use case.

Another aspect is the information needed to perform certain optimizations. The Stream API is built around the Spliterator interface which can provide characteristics of the source data, e.g. it allows to find out whether the data has a meaningful order needed to be retained for certain operations or whether it is already pre-sorted, to the natural order or with a particular comparator. It may also provide the expected number of elements, as an estimate or exact, when predictable.

A method receiving an arbitrary Collection, to implement an algorithm with an ordinary loop, would have a hard time to find out, whether there are such characteristics. A List implies a meaningful order, whereas a Set usually does not, unless it’s a SortedSet or a LinkedHashSet, whereas the latter is a particular implementation class, rather than an interface. So testing against all known constellations may still miss 3rd party implementations with special contracts not expressible by a predefined interface.

Of course, since Java 8, you could acquire a Spliterator yourself, to examine these characteristics, but that would change your loop solution to a non-trivial thing and also imply repeating the work already done with the Stream API.

There’s also another interesting difference between Spliterator based Stream solutions and conventional loops, using an Iterator when iterating over something other than an array. The pattern is to invoke hasNext on the iterator, followed by next, unless hasNext returned false. But the contract of Iterator does not mandate this pattern. A caller may invoke next without hasNext, even multiple times, when it is known to succeed (e.g. you do already know the collection’s size). Also, a caller may invoke hasNext multiple times without next in case the caller did not remember the result of the previous call.

As a consequence, Iterator implementations have to perform redundant operations, e.g. the loop condition is effectively checked twice, once in hasNext, to return a boolean, and once in next, to throw a NoSuchElementException when not fulfilled. Often, the hasNext has to perform the actual traversal operation and store the result into the Iterator instance, to ensure that the result stays valid until the subsequent next call. The next operation in turn, has to check whether such a traversal did already happen or whether it has to perform the operation itself. In practice, the hot spot optimizer may or may not eliminate the overhead imposed by the Iterator design.

In contrast, the Spliterator has a single traversal method, boolean tryAdvance(Consumer<? super T> action), which performs the actual operation and returns whether there was an element. This simplifies the loop logic significantly. There’s even the void forEachRemaining(Consumer<? super T> action) for non-short-circuiting operations, which allows the actual implementation to provide the entire looping logic. E.g., in case of ArrayList the operation will end up at a simple counting loop over the indices, performing a plain array access.

You may compare such design with, e.g. readLine() of BufferedReader, which performs the operation and returns null after the last element, or find() of a regex Matcher, which performs the search, updates the matcher’s state and returns the success state.

But the impact of such design differences is hard to predict in an environment with an optimizer designed specifically to identify and eliminate redundant operations. The takeaway is that there is some potential for Stream based solutions to turn out to be even faster, while it depends on a lot of factors whether it will ever materialize in a particular scenario. As said at the beginning, it’s usually not changing the overall time complexity, which would be more important to worry about.

197

answered Oct 20 '22 12:10

Holger

Related questions
                            
                                Spring test: configure datasource for org.springframework.test.context.jdbc.Sql
                            
                                How to declare a Kotlin function with return type 'void' for a java caller?
                            
                                Elasticsearch crashes after showing t: failed to read local state , exiting
                            
                                ClassCastException when opening excel file
                            
                                error: incompatible types: possible lossy conversion from int to short. I don't know why i'm getting this error message
                            
                                jackson serialization is excluding double value 0.0
                            
                                What functionality is the java code responsible for within the GraphEditor in the mxgraph example for javascript?
                            
                                How to use new androidx.media2.widget.VideoView
                            
                                How to improve performance of a simple select query in oracle
                            
                                google api refresh token returns null with react-google-login
                            
                                What is the best practice: Use prototype bean instead of new () operator
                            
                                Java Spring RestFull API
                            
                                Cannot use custom spring converter in unit test
                            
                                Can't Split Class into Smaller Ones
                            
                                Proper way to enrich Keycloak token via external service
                            
                                How to write one regular expression to meet all cases and print specified variable
                            
                                Is it reasonable to throw an exception from an asynchronous method?
                            
                                Setting Turkish and English locale: translate Turkish characters to Latin equivalents
                            
                                Is it possible to store secrets on the stack in java?
                            
                                invokeAndWait method in SwingUtilities

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Are there any direct or indirect performance benefits of java 8 sequential streams?

Tags:

java

java-8

Hitesh

People also ask

1 Answers

Holger

Recent Activity

Donate For Us