After reading about Java 8's java.util.stream.IntStream, I have been replacing some of my traditional loops with streams. Unfortunately, I ran into performance issues when dealing with nested loops.
As expected, the following code runs in about 47 ms on my machine:
IntStream.range(0, 1000000000).forEach(i -> {});
However, nesting another IntStream inflates the execution time enormously, to about 10,458 ms:
IntStream.range(0, 1000000000).forEach(i -> {
    IntStream.range(0, 1).forEach(j -> {});
});
Is this a case of misuse on my part, or is this an issue that may be resolved in the future?
EDIT: Just for comparison, the following code, using a traditional inner loop, ran much faster (1,801 ms). So even taking optimization into account, there seems to be extra overhead in using an inner IntStream?
final long[] random = {1};
IntStream.range(0, 1000000000).forEach(i -> {
    for (int j = 0; j < 1; j++) {
        random[0] += i;
    }
});
Yes, streams are sometimes slower than loops, but they can also be just as fast; it depends on the circumstances. The point to take home is that sequential streams are no faster than loops.
There are a few special cases where streams are hard to apply, such as looping over two or three collections simultaneously. In such cases streams make little sense, and a plain for loop is preferable. But in general there are no hard rules about when to use one construct or the other.
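To illustrate the "two collections simultaneously" case, here is a minimal sketch (the class name, method names, and sample data are hypothetical) comparing a plain indexed for loop with the stream workaround of streaming over the indices:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PairwiseLoop {
    static final List<String> NAMES = Arrays.asList("a", "b", "c");
    static final List<Integer> COUNTS = Arrays.asList(1, 2, 3);

    // Plain for loop: the natural fit for walking two lists in lockstep.
    static String viaLoop() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < NAMES.size(); i++) {
            sb.append(NAMES.get(i)).append('=').append(COUNTS.get(i)).append(';');
        }
        return sb.toString();
    }

    // Stream workaround: stream over the index range and look both lists up.
    static String viaStream() {
        return IntStream.range(0, NAMES.size())
                .mapToObj(i -> NAMES.get(i) + "=" + COUNTS.get(i))
                .collect(Collectors.joining(";", "", ";"));
    }

    public static void main(String[] args) {
        System.out.println(viaLoop());   // a=1;b=2;c=3;
        System.out.println(viaStream()); // a=1;b=2;c=3;
    }
}
```

Both produce the same result, but the stream version has to fall back to indexed access anyway, which is why the for loop is usually preferred here.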
With Java 8 streams, performance comes from parallelism, laziness, and short-circuit operations, but there is a downside as well: we need to be cautious when choosing streams, as careless use can degrade an application's performance. Let us look at the factors that determine stream performance.
Remember that loops use an imperative style and streams a declarative style, so streams are likely to be much easier to maintain. If you have a small list, a loop performs better; if you have a huge list, a parallel stream may perform better.
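As a minimal sketch of the sequential-versus-parallel choice (the class and method names are made up for illustration), the same reduction can be run either way, and the parallel version only pays off once the element count or per-element work is large enough to amortize the fork/join overhead:

```java
import java.util.stream.LongStream;

public class ParallelSketch {
    // Sum of 0..n-1 with a sequential stream.
    static long sumSequential(long n) {
        return LongStream.range(0, n).sum();
    }

    // Same sum with a parallel stream; the result is identical,
    // but the work is split across the common fork/join pool.
    static long sumParallel(long n) {
        return LongStream.range(0, n).parallel().sum();
    }

    public static void main(String[] args) {
        long n = 10_000_000L;
        System.out.println(sumSequential(n));
        System.out.println(sumParallel(n));
    }
}
```

For tiny inputs, the thread-coordination cost of the parallel version typically exceeds any savings, which matches the "small list: loop, huge list: parallel stream" rule of thumb above.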
It's not that the performance in the second case is terrible; it's that the performance in the first case is unbelievably good. Consider: you iterate over one billion elements, and the iteration takes only 47 ms. That means that in one second you could iterate over 1000/47 ≈ 21 billion elements! Your CPU frequency is probably about 3 GHz, so you would be processing about 7 elements per CPU cycle! Such optimization is performed by the JIT compiler for very simple loops (here the loop body is actually optimized away entirely during dead-code elimination). However, you won't earn money writing empty loops. As soon as you add some non-trivial logic, some of these optimizations are disabled or become much less effective, and you see a significant performance drop.
I suggest testing real code and profiling your application to find the slowest parts. Artificial examples have little in common with the real performance of production code.
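A minimal sketch of why this matters when timing by hand (the class and method names are hypothetical; for trustworthy numbers use a harness such as JMH): if the loop's result is actually used, the JIT cannot eliminate it as dead code, so the measurement reflects real work rather than an optimized-away no-op.

```java
public class DeadCodeSketch {
    // A loop whose result is returned and consumed cannot be
    // removed by dead-code elimination.
    static long accumulate(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Naive timing: no warmup, single run; treat the number as
        // a rough indication only, not a benchmark result.
        long start = System.nanoTime();
        long result = accumulate(100_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("sum=" + result + " in " + elapsedMs + " ms");
    }
}
```

Printing (or otherwise consuming) `result` is the key detail; without it, the JIT may legally delete the whole loop, which is exactly what happened to the 47 ms empty-loop measurement above.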
From the Java docs:
void forEach(IntConsumer action)
Performs an action for each element of this stream. This is a terminal operation.
Terminal operations, such as Stream.forEach or IntStream.sum, may traverse the stream to produce a result or a side-effect. After the terminal operation is performed, the stream pipeline is considered consumed, and can no longer be used; if you need to traverse the same data source again, you must return to the data source to get a new stream. In almost all cases, terminal operations are eager, completing their traversal of the data source and processing of the pipeline before returning. Only the terminal operations iterator() and spliterator() are not; these are provided as an "escape hatch" to enable arbitrary client-controlled pipeline traversals in the event that the existing operations are not sufficient to the task.
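The "considered consumed" part is observable directly: attempting a second traversal of the same stream throws an IllegalStateException. A minimal sketch (the class name is made up for illustration):

```java
import java.util.stream.IntStream;

public class ConsumedStream {
    public static void main(String[] args) {
        IntStream s = IntStream.range(0, 3);
        s.sum(); // terminal operation: the pipeline is now consumed

        try {
            s.forEach(i -> {}); // second traversal is not allowed
        } catch (IllegalStateException e) {
            System.out.println("stream already consumed");
        }
    }
}
```

This is why each run of the inner `IntStream.range(0, 1).forEach(...)` in the question must build a brand-new stream pipeline on every one of the billion outer iterations.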
There is an overhead of creating lots of Streams. Have you tried to run the code with profiler?