
Proper usage of parallel streams in Java

Tags: java, parallel-processing, java-8, java-stream, forkjoinpool

I'm experimenting with parallel streams in Java, and for that I have the following code for counting the primes up to n.

Basically, I have 2 methods:

calNumberOfPrimes(long n) - 4 different variants
isPrime(long n) - 2 different variants

Each method has one variant that uses a parallel stream and one that doesn't. Since calNumberOfPrimes can also call either variant of isPrime, it comes in 4 combinations.


    // requires: import java.util.stream.LongStream;

    // itself uses parallel stream and calls parallel variant isPrime
    private static long calNumberOfPrimesPP(long n) {
        return LongStream
                .rangeClosed(2, n)
                .parallel()
                .filter(i -> isPrimeParallel(i))
                .count();
    }

    // itself uses parallel stream and calls non-parallel variant isPrime
    private static long calNumberOfPrimesPNP(long n) {
        return LongStream
                .rangeClosed(2, n)
                .parallel()
                .filter(i -> isPrimeNonParallel(i))
                .count();
    }

    // itself uses non-parallel stream and calls parallel variant isPrime
    private static long calNumberOfPrimesNPP(long n) {
        return LongStream
                .rangeClosed(2, n)
                .filter(i -> isPrimeParallel(i))
                .count();
    }

    // itself uses non-parallel stream and calls non-parallel variant isPrime
    private static long calNumberOfPrimesNPNP(long n) {
        return LongStream
                .rangeClosed(2, n)
                .filter(i -> isPrimeNonParallel(i))
                .count();
    }

    // uses parallel stream
    private static boolean isPrimeParallel(long n) {
        return LongStream
                .rangeClosed(2, (long) Math.sqrt(n))
                .parallel()
                .noneMatch(i -> n % i == 0);
    }

    // uses non-parallel stream
    private static boolean isPrimeNonParallel(long n) {
        return LongStream
                .rangeClosed(2, (long) Math.sqrt(n))
                .noneMatch(i -> n % i == 0);
    }

I'm trying to work out which of calNumberOfPrimesPP, calNumberOfPrimesPNP, calNumberOfPrimesNPP and calNumberOfPrimesNPNP makes the best use of parallel streams in terms of efficiency, and why.

I timed each of these 4 methods 50 times and took the average using the following code:


    // requires: import java.util.concurrent.Callable;
    public static void main(String[] args) throws Exception {
        int iterations = 50;
        int n = 1000000;
        double pp, pnp, npp, npnp;
        pp = pnp = npp = npnp = 0;
        for (int i = 0; i < iterations; i++) {
            Callable<Long> runner1 = () -> calNumberOfPrimesPP(n);
            Callable<Long> runner2 = () -> calNumberOfPrimesPNP(n);
            Callable<Long> runner3 = () -> calNumberOfPrimesNPP(n);
            Callable<Long> runner4 = () -> calNumberOfPrimesNPNP(n);

            pp += TimeIt.timeIt(runner1);
            pnp += TimeIt.timeIt(runner2);
            npp += TimeIt.timeIt(runner3);
            npnp += TimeIt.timeIt(runner4);
        }
        System.out.println("___________final results___________");
        System.out.println("avg PP = " + pp / iterations);
        System.out.println("avg PNP = " + pnp / iterations);
        System.out.println("avg NPP = " + npp / iterations);
        System.out.println("avg NPNP = " + npnp / iterations);
    }

TimeIt.timeIt simply returns the execution time in milliseconds. I got the following output:


___________final results___________
avg PP = 2364.51336366
avg PNP = 265.27284506
avg NPP = 11424.194316620002
avg NPNP = 1138.15516624

Now I'm trying to reason about the above execution times:

The PP variant is not as fast as the PNP variant because all parallel streams share the common fork-join thread pool, so if we submit a long-running task, we effectively block all threads in the pool.
But the same argument should also apply to the NPP variant, so NPP should be approximately as fast as PNP. This is not the case: NPP is the slowest of all. Can someone please explain the reason behind this?
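
One way to check the assumption that nested parallel streams all draw on the same pool is to record which threads run the outer and the inner lambdas. The following is only a minimal sketch: `PoolThreads` and `threadsUsed` are hypothetical names that do not appear in the code above, and the stream sizes are arbitrary.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.LongStream;

// Hypothetical helper, not part of the original code.
class PoolThreads {

    // Runs a nested parallel stream (shaped like the PP variant) and records
    // the name of every thread that processes an outer or an inner element.
    static Set<String> threadsUsed(long outerSize, long innerSize) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        LongStream.rangeClosed(1, outerSize)
                .parallel()
                .forEach(i -> {
                    names.add("outer: " + Thread.currentThread().getName());
                    LongStream.rangeClosed(1, innerSize)
                            .parallel()
                            .forEach(j -> names.add("inner: " + Thread.currentThread().getName()));
                });
        return names;
    }

    public static void main(String[] args) {
        // Print the distinct thread names, outer and inner side by side.
        threadsUsed(1_000, 1_000).stream().sorted().forEach(System.out::println);
    }
}
```

On a typical JDK the names printed are the calling thread (for example main) and workers named like ForkJoinPool.commonPool-worker-N, for the outer and the inner stream alike. The exact names and the number of distinct threads depend on the JVM and the machine.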

My questions:

Is my reasoning correct for the short running time of the PNP variant?
Am I missing something?
Why is the NPP variant the worst (in terms of running time)?

How TimeIt measures time:


import java.util.concurrent.Callable;

class TimeIt {
    private TimeIt() {
    }

    /**
     * returns the time to execute the Callable in milliseconds
     */
    public static <T> double timeIt(Callable<T> callable) throws Exception {
        long start = System.nanoTime();
        System.out.println(callable.call());
        return (System.nanoTime() - start) / 1.0e6;
    }
}

PS: I understand that this is not the best way to count primes; the Sieve of Eratosthenes and other more sophisticated methods exist for that. With this example I just want to understand the behaviour of parallel streams and when to use them.


asked Feb 24 '19 07:02

Lavish Kothari

1 Answer

I think it is clear why NPP is so slow.

Arrange your results in a table:


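average time (ms)         | inner parallel (_P) | inner non-parallel (_NP)
--------------------------+---------------------+-------------------------
outer parallel (P_)       |        2364         |           265
--------------------------+---------------------+-------------------------
outer non-parallel (NP_)  |       11424         |          1138
--------------------------+---------------------+-------------------------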

So it is always faster when the outer stream is parallel: 2364 vs. 11424 ms with the parallel isPrime, and 265 vs. 1138 ms with the non-parallel one, a speedup of roughly 4 to 5 times. This is because the outer stream has a lot of work to do, so the additional overhead of handling the parallel stream is low compared to the work itself.
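
How much an outer parallel stream can gain depends on how many threads the common pool offers, and the question does not state which machine was used for the measurements. A small sketch for inspecting this on your own machine; `PoolSize` and its two methods are hypothetical names introduced here for illustration.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper, not part of the original code.
class PoolSize {

    // Target parallelism of the common pool, which parallel streams use by default.
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    // Number of processors the JVM reports as available.
    static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("available processors    = " + processors());
        System.out.println("common pool parallelism = " + commonPoolParallelism());
    }
}
```

By default the parallelism is typically one less than the processor count, because the thread that starts the stream also takes part in the work.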

It is also always faster when the inner stream is not parallel: the runs using isPrimeNonParallel are roughly 9 to 10 times faster than those using isPrimeParallel (265 vs. 2364 ms and 1138 vs. 11424 ms). This is because the inner stream has little work to do. In most cases it is clear after a few steps that the number is not prime; half of the numbers are even and are rejected after a single step. The additional overhead of handling the parallel stream is therefore high compared to the work itself.
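
To get a feeling for how little work the inner stream usually has, you can count how many candidate divisors a sequential noneMatch has to test before it can stop. This is only a rough sketch: `InnerWork`, `divisionsNeeded` and `totalDivisions` are hypothetical names introduced here, and the count models the sequential isPrimeNonParallel only, not the parallel variant.

```java
import java.util.OptionalLong;
import java.util.stream.LongStream;

// Hypothetical helper, not part of the original code.
class InnerWork {

    // Number of candidates a sequential noneMatch over 2..floor(sqrt(n)) tests:
    // it stops at the first divisor found, otherwise it walks the whole range.
    static long divisionsNeeded(long n) {
        long limit = (long) Math.sqrt(n);
        OptionalLong firstDivisor = LongStream.rangeClosed(2, limit)
                .filter(i -> n % i == 0)
                .findFirst();
        if (firstDivisor.isPresent()) {
            return firstDivisor.getAsLong() - 1; // candidates 2..divisor
        }
        return Math.max(0, limit - 1);           // whole range, or empty for n < 4
    }

    // Total number of candidates tested for all numbers in 2..n.
    static long totalDivisions(long n) {
        return LongStream.rangeClosed(2, n).map(InnerWork::divisionsNeeded).sum();
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        long total = totalDivisions(n);
        System.out.println("total divisions    = " + total);
        System.out.println("average per number = " + (double) total / (n - 1));
    }
}
```

For example, divisionsNeeded(1_000_000) is 1 because 2 already divides it, while the prime 97 needs all 8 candidates from 2 to 9. Splitting such short ranges across threads costs more than the divisions themselves.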


answered Oct 23 '22 18:10

Donat
