Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Proper usage of parallel streams in Java

I'm experimenting with parallel streams in Java and for that I've the following code for calculating number of primes before n.

Basically I'm having 2 methods

  • calNumberOfPrimes(long n) - 4 different variants
  • isPrime(long n) - 2 different variants

Actually I'm having 2 different variants of each of the above method, one variant that uses parallel streams and other variant that don't use parallel streams.

    // itself uses parallel stream and calls parallel variant isPrime
    private static long calNumberOfPrimesPP(long n) {
        return LongStream
                .rangeClosed(2, n)
                .parallel()
                .filter(i -> isPrimeParallel(i))
                .count();
    }

    // itself uses parallel stream and calls non-parallel variant isPrime
    private static long calNumberOfPrimesPNP(long n) {
        return LongStream
                .rangeClosed(2, n)
                .parallel()
                .filter(i -> isPrimeNonParallel(i))
                .count();
    }

    // itself uses non-parallel stream and calls parallel variant isPrime
    private static long calNumberOfPrimesNPP(long n) {
        return LongStream
                .rangeClosed(2, n)
                .filter(i -> isPrimeParallel(i))
                .count();
    }

    // itself uses non-parallel stream and calls non-parallel variant isPrime
    private static long calNumberOfPrimesNPNP(long n) {
        return LongStream
                .rangeClosed(2, n)
                .filter(i -> isPrimeNonParallel(i))
                .count();
    }
    // uses parallel stream
    private static boolean isPrimeParallel(long n) {
        return LongStream
                .rangeClosed(2, (long) Math.sqrt(n))
                .parallel()
                .noneMatch(i -> n % i == 0);
    }

    // uses non-parallel stream
    private static boolean isPrimeNonParallel(long n) {
        return LongStream
                .rangeClosed(2, (long) Math.sqrt(n))
                .noneMatch(i -> n % i == 0);
    }

I'm trying to reason out which amongst calNumberOfPrimesPP, calNumberOfPrimesPNP, calNumberOfPrimesNPP and calNumberOfPrimesNPNP is the best in terms of proper usage of parallel streams with efficiency and why it is the best.

I tried to time all these 4 methods in 50 times and took the average using the following code:

    public static void main(String[] args) throws Exception {
        int iterations = 50;
        int n = 1000000;
        double pp, pnp, npp, npnp;
        pp = pnp = npp = npnp = 0;
        for (int i = 0; i < iterations; i++) {
            Callable<Long> runner1 = () -> calNumberOfPrimesPP(n);
            Callable<Long> runner2 = () -> calNumberOfPrimesPNP(n);
            Callable<Long> runner3 = () -> calNumberOfPrimesNPP(n);
            Callable<Long> runner4 = () -> calNumberOfPrimesNPNP(n);

            pp += TimeIt.timeIt(runner1);
            pnp += TimeIt.timeIt(runner2);
            npp += TimeIt.timeIt(runner3);
            npnp += TimeIt.timeIt(runner4);
        }
        System.out.println("___________final results___________");
        System.out.println("avg PP = " + pp / iterations);
        System.out.println("avg PNP = " + pnp / iterations);
        System.out.println("avg NPP = " + npp / iterations);
        System.out.println("avg NPNP = " + npnp / iterations);
    }

TimeIt.timeIt simply returns the execution time in milli-seconds. I got the following output:

___________final results___________
avg PP = 2364.51336366
avg PNP = 265.27284506
avg NPP = 11424.194316620002
avg NPNP = 1138.15516624

Now I'm trying to reason about the above execution times:

  • The PP variant is not as fast as PNP variant because all parallel streams use common fork-join thread pool and if we submit a long-running task, we are effectively blocking all threads in the pool.
  • But the above argument should also work for NPP variant and so the NPP variant should also be approximately as fast as the PNP variant. (But this is not the case, NPP variant is the worst in terms of time taken). Can someone please explain the reason behind this?

My questions:

  • Is my reasoning correct for the small running time of PNP variant?
  • Am I missing something?
  • Why NPP variant is the worst (in terms of running time)?

How TimeIt is measuring time:

class TimeIt {
    private TimeIt() {
    }

    /**
     * returns the time to execute the Callable in milliseconds
     */
    public static <T> double timeIt(Callable<T> callable) throws Exception {
        long start = System.nanoTime();
        System.out.println(callable.call());
        return (System.nanoTime() - start) / 1.0e6;
    }
}

PS: I understand that this is not the best method to count the number of primes. Sieve of Eratosthenes and other more sophisticated methods exists to do that. But by this example I just want to understand the behaviour of parallel streams and when to use them.

like image 794
Lavish Kothari Avatar asked Feb 24 '19 07:02

Lavish Kothari


People also ask

Is it good to use parallel stream?

Parallel Stream It is a very useful feature of Java to use parallel processing, even if the whole program may not be parallelized. Parallel stream leverage multi-core processors, which increases its performance.

When would you not use parallel streams?

Similarly, don't use parallel if the stream is ordered and has much more elements than you want to process, e.g. This may run much longer because the parallel threads may work on plenty of number ranges instead of the crucial one 0-100, causing this to take very long time.

What is true about parallel streams in Java?

Usually, any Java code that has only one processing stream, where it is sequentially executed. However, by using parallel streams, one can separate the Java code into more than one stream, which is executed in parallel on their separate cores, and the end result is the combination of the individual results.


1 Answers

I think, it is clear, why NPP is so slow.

Arrange your resulting numbers in a table:

       |    _P    |   _NP
-------+----------+---------
  P_   |   2364   |   265
-------+----------+---------
  NP_  |  11424   |  1138
-------+----------+---------

So you see that it is always faster when the outer stream is parallel. This is because there is much work to be done in the stream. So the additional overhead for handling the parallel stream is low compared to the work to be done.

You see also that it is always faster when the inner stream is not parallel. isPrimeNonParallel is faster than isPrimeParallel. This is because there is not much work to be done in the stream. In most cases it is clear after a few steps that the number is not prime. Half of the numbers are even (only one step). The additional overhead for handling the parallel stream is high compared to the work to be done.

like image 152
Donat Avatar answered Oct 23 '22 18:10

Donat