
Java8 streams strange behavior

I was benchmarking some Java 8 Streams API snippets, but I could not figure out what is happening with this one.

I was wondering how parallelStream actually works and wanted to compare sequential and parallel processing. I wrote two methods that each sum 32,768,000 BigDecimals: one uses parallelStream, the other a plain sequential loop. I know the resulting test is not a valid benchmark, but a few things caught my attention.

The methods are:

private static void sumWithParallelStream() {
    BigDecimal[] list = new BigDecimal[32_768_000];
    BigDecimal total = BigDecimal.ZERO;
    for (int i = 0; i < 32_768_000; i++) {
        list[i] = new BigDecimal(i);
    }
    total = Arrays.asList(list).parallelStream().reduce(BigDecimal.ZERO, BigDecimal::add);
    System.out.println("Total: " + total);
}

private static void sequenceSum() {
    BigDecimal total = BigDecimal.ZERO;
    for (int i = 0; i < 32_768_000; i++) {
        total = total.add(new BigDecimal(i));
    }
    System.out.println("Total: " + total);
}

The output was:

Total: 536870895616000
sumWithParallelStream(): 30502 ms

Total: 536870895616000
sequenceSum(): 271 ms

Then I tried removing the parallelStream call to check its real impact:

private static void sumWithParallelStream() {
    BigDecimal[] list = new BigDecimal[32_768_000];
    BigDecimal total = BigDecimal.ZERO;
    for (int i = 0; i < 32_768_000; i++) {
        list[i] = new BigDecimal(i);
        total = total.add(list[i]);
    }
    System.out.println("Total: " + total);
}

Note that the sequenceSum() method remains unchanged.

And surprisingly, the new output was:

Total: 536870895616000
sumWithParallelStream(): 13487 ms

Total: 536870895616000
sequenceSum(): 879 ms

I repeated these changes several times, adding and removing the parallelStream call, and the results for sequenceSum() were consistent: ~200 ms when parallelStream is involved, ~800 ms when it is not. I tested on different machines, on both Windows and Ubuntu.

Finally, my two questions are:

  1. Why does the usage of parallelStream on the first method interfere with the second one?
  2. Why did storing the BigDecimal instances in the array make the first method much slower (800 ms to 13,000 ms)?
Jean Jung asked Oct 19 '22 08:10



1 Answer

In the first example you allocate an array of 32,768,000 elements and then stream over it. That array allocation, and the memory fetches that come with it, are not needed and are probably what slows the method down. You can generate the values directly in the stream instead:

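int limit = 32_768_000; // same element count as in the question
// Create each BigDecimal inside the stream instead of filling an array first
BigDecimal total = IntStream.range(0, limit).parallel()
   .mapToObj(BigDecimal::new)
   .reduce(BigDecimal.ZERO, BigDecimal::add);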
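To try this out side by side with the plain loop, here is a minimal sketch. The class name StreamSumSketch, the helper names parallelSum and sequentialSum, and the choice of five repetitions are assumptions made for illustration; none of them come from the question. Timing with System.nanoTime() over a few repeated runs is only a rough check, not a rigorous benchmark.

```java
import java.math.BigDecimal;
import java.util.stream.IntStream;

// Hypothetical helper class, not part of the original question or answer.
public class StreamSumSketch {

    // Sums 0..limit-1 in parallel without materializing an array first.
    static BigDecimal parallelSum(int limit) {
        return IntStream.range(0, limit).parallel()
                .mapToObj(BigDecimal::new)
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    // Same loop as sequenceSum() in the question, but returning the total.
    static BigDecimal sequentialSum(int limit) {
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < limit; i++) {
            total = total.add(new BigDecimal(i));
        }
        return total;
    }

    public static void main(String[] args) {
        int limit = 32_768_000;
        // Repeat the measurement so that the first, cold run can be told apart from later ones.
        for (int run = 0; run < 5; run++) {
            long t0 = System.nanoTime();
            BigDecimal parallel = parallelSum(limit);
            long t1 = System.nanoTime();
            BigDecimal sequential = sequentialSum(limit);
            long t2 = System.nanoTime();
            System.out.printf("run %d: parallel %d ms, sequential %d ms, same total: %b%n",
                    run, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000,
                    parallel.equals(sequential));
        }
    }
}
```

Returning the total instead of printing it inside each method keeps console output out of the timed region and lets the two results be compared directly.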
dolan answered Oct 22 '22 00:10
