
Why is parallel stream slower?

I was playing around with infinite streams and wrote this program for benchmarking. Basically, the bigger the number you provide, the faster it finishes. However, I was amazed to find that using a parallel stream resulted in far worse performance than a sequential stream. Intuitively, one would expect an infinite stream of random numbers to be generated and evaluated much faster in a multi-threaded environment, but this appears not to be the case. Why is this?

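    import java.util.stream.IntStream;

    // Hypothetical wrapper class so the snippet compiles on its own
    public class StreamBenchmark {
        public static void main(String[] args) {
            final int target = Integer.parseInt(args[0]);
            if (target <= 0) {
                System.err.println("Target must be between 1 and 2147483647");
                return;
            }

            final long startTime = System.currentTimeMillis();

            System.out.println(
                // Generate random ints until one is at most the target
                IntStream.generate(() -> (int) (Math.random() * Integer.MAX_VALUE))
                    //.parallel()  // uncomment to benchmark the parallel version
                    .filter(i -> i <= target)
                    .findFirst()
                    .getAsInt()
            );

            final long endTime = System.currentTimeMillis();
            System.out.println("Execution time: " + (endTime - startTime) + " ms");
        }
    }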
Sina Madani asked Dec 29 '16 13:12





1 Answer

I totally agree with the other comments and answers, but your test does indeed behave strangely when the target is very low. On my modest laptop the parallel version is on average about 60x slower when very low targets are given. Such an extreme difference cannot be explained by the parallelization overhead of the stream APIs alone, so I was also amazed :-). IMO the culprit lies here:

    Math.random()

Internally, this call relies on a single global instance of java.util.Random that is shared by all threads. The documentation of Random states:

Instances of java.util.Random are threadsafe. However, the concurrent use of the same java.util.Random instance across threads may encounter contention and consequent poor performance. Consider instead using ThreadLocalRandom in multithreaded designs.

So I think the really poor performance of the parallel execution compared to the sequential one is explained by thread contention on the shared Random instance rather than by any other overhead. If you use ThreadLocalRandom instead (as the documentation recommends), the performance difference should not be so dramatic. Another option would be to implement a more advanced number supplier.
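
As a minimal sketch of the ThreadLocalRandom suggestion, the benchmark could be rewritten roughly as follows. The class name ThreadLocalRandomBenchmark, the helper findFirstAtMost, and the default target of 1000000 are my own hypothetical choices, not part of the original program. The only substantive change is that each call to the supplier asks for the current thread's generator instead of going through Math.random().

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.IntStream;

// Hypothetical class and method names, used only for illustration.
public class ThreadLocalRandomBenchmark {

    // Returns the first generated value that is <= target.
    static int findFirstAtMost(int target, boolean parallel) {
        // ThreadLocalRandom.current() is called inside the lambda, so every
        // worker thread draws from its own generator rather than a shared one.
        IntStream stream = IntStream.generate(
                () -> ThreadLocalRandom.current().nextInt(Integer.MAX_VALUE));
        if (parallel) {
            stream = stream.parallel();
        }
        return stream
                .filter(i -> i <= target)
                .findFirst()
                .getAsInt();
    }

    public static void main(String[] args) {
        // Assumed default target when no argument is given.
        final int target = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        if (target <= 0) {
            System.err.println("Target must be between 1 and 2147483647");
            return;
        }

        // Time the sequential and the parallel variant with the same target.
        for (boolean parallel : new boolean[] {false, true}) {
            final long start = System.nanoTime();
            final int result = findFirstAtMost(target, parallel);
            final long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println((parallel ? "parallel" : "sequential")
                    + ": " + result + " in " + elapsedMs + " ms");
        }
    }
}
```

The actual timings depend on the machine, the JVM, and the target, so treat a single run as a rough indication only. For very high targets the parallel version may still lose, simply because a match turns up before the thread coordination pays off.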

Lachezar Balev answered Sep 26 '22 16:09
