I'm experimenting with parallel streams in Java and for that I've the following code for calculating number of primes before n
.
Basically I'm having 2 methods
calNumberOfPrimes(long n)
- 4 different variantsisPrime(long n)
- 2 different variantsActually I'm having 2 different variants of each of the above method, one variant that uses parallel streams and other variant that don't use parallel streams.
// itself uses parallel stream and calls parallel variant isPrime
private static long calNumberOfPrimesPP(long n) {
return LongStream
.rangeClosed(2, n)
.parallel()
.filter(i -> isPrimeParallel(i))
.count();
}
// itself uses parallel stream and calls non-parallel variant isPrime
private static long calNumberOfPrimesPNP(long n) {
return LongStream
.rangeClosed(2, n)
.parallel()
.filter(i -> isPrimeNonParallel(i))
.count();
}
// itself uses non-parallel stream and calls parallel variant isPrime
private static long calNumberOfPrimesNPP(long n) {
return LongStream
.rangeClosed(2, n)
.filter(i -> isPrimeParallel(i))
.count();
}
// itself uses non-parallel stream and calls non-parallel variant isPrime
private static long calNumberOfPrimesNPNP(long n) {
return LongStream
.rangeClosed(2, n)
.filter(i -> isPrimeNonParallel(i))
.count();
}
// uses parallel stream
private static boolean isPrimeParallel(long n) {
return LongStream
.rangeClosed(2, (long) Math.sqrt(n))
.parallel()
.noneMatch(i -> n % i == 0);
}
// uses non-parallel stream
private static boolean isPrimeNonParallel(long n) {
return LongStream
.rangeClosed(2, (long) Math.sqrt(n))
.noneMatch(i -> n % i == 0);
}
I'm trying to reason out which amongst calNumberOfPrimesPP
, calNumberOfPrimesPNP
, calNumberOfPrimesNPP
and calNumberOfPrimesNPNP
is the best in terms of proper usage of parallel streams with efficiency and why it is the best.
I tried to time all these 4 methods in 50 times and took the average using the following code:
public static void main(String[] args) throws Exception {
int iterations = 50;
int n = 1000000;
double pp, pnp, npp, npnp;
pp = pnp = npp = npnp = 0;
for (int i = 0; i < iterations; i++) {
Callable<Long> runner1 = () -> calNumberOfPrimesPP(n);
Callable<Long> runner2 = () -> calNumberOfPrimesPNP(n);
Callable<Long> runner3 = () -> calNumberOfPrimesNPP(n);
Callable<Long> runner4 = () -> calNumberOfPrimesNPNP(n);
pp += TimeIt.timeIt(runner1);
pnp += TimeIt.timeIt(runner2);
npp += TimeIt.timeIt(runner3);
npnp += TimeIt.timeIt(runner4);
}
System.out.println("___________final results___________");
System.out.println("avg PP = " + pp / iterations);
System.out.println("avg PNP = " + pnp / iterations);
System.out.println("avg NPP = " + npp / iterations);
System.out.println("avg NPNP = " + npnp / iterations);
}
TimeIt.timeIt
simply returns the execution time in milli-seconds. I got the following output:
___________final results___________
avg PP = 2364.51336366
avg PNP = 265.27284506
avg NPP = 11424.194316620002
avg NPNP = 1138.15516624
Now I'm trying to reason about the above execution times:
PP
variant is not as fast as PNP
variant because all parallel streams use common fork-join thread pool and if we submit a long-running task, we are effectively blocking all threads in the pool. NPP
variant and so the NPP
variant should also be approximately as fast as the PNP
variant. (But this is not the case, NPP
variant is the worst in terms of time taken). Can someone please explain the reason behind this?My questions:
PNP
variant?NPP
variant is the worst (in terms of running time)?How TimeIt
is measuring time:
class TimeIt {
private TimeIt() {
}
/**
* returns the time to execute the Callable in milliseconds
*/
public static <T> double timeIt(Callable<T> callable) throws Exception {
long start = System.nanoTime();
System.out.println(callable.call());
return (System.nanoTime() - start) / 1.0e6;
}
}
PS: I understand that this is not the best method to count the number of primes. Sieve of Eratosthenes and other more sophisticated methods exists to do that. But by this example I just want to understand the behaviour of parallel streams and when to use them.
Parallel Stream It is a very useful feature of Java to use parallel processing, even if the whole program may not be parallelized. Parallel stream leverage multi-core processors, which increases its performance.
Similarly, don't use parallel if the stream is ordered and has much more elements than you want to process, e.g. This may run much longer because the parallel threads may work on plenty of number ranges instead of the crucial one 0-100, causing this to take very long time.
Usually, any Java code that has only one processing stream, where it is sequentially executed. However, by using parallel streams, one can separate the Java code into more than one stream, which is executed in parallel on their separate cores, and the end result is the combination of the individual results.
I think, it is clear, why NPP is so slow.
Arrange your resulting numbers in a table:
| _P | _NP
-------+----------+---------
P_ | 2364 | 265
-------+----------+---------
NP_ | 11424 | 1138
-------+----------+---------
So you see that it is always faster when the outer stream is parallel. This is because there is much work to be done in the stream. So the additional overhead for handling the parallel stream is low compared to the work to be done.
You see also that it is always faster when the inner stream is not parallel. isPrimeNonParallel
is faster than isPrimeParallel
. This is because there is not much work to be done in the stream. In most cases it is clear after a few steps that the number is not prime. Half of the numbers are even (only one step). The additional overhead for handling the parallel stream is high compared to the work to be done.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With