
Are Java 8 streams and lambdas deceiving? [closed]

I've been using Java 8's lambdas and streams for a while because of my master's degree project, and I noticed a few things that are not widely discussed on the internet. I'm using NetBeans to develop, and it often suggests replacing the "old-fashioned" style with these two new constructs. But I wonder whether these suggestions are in fact useful. The points are:

  • Legibility

Maybe a matter of habit, but if you use nested lambdas it can become a nightmare to understand what is happening.

  • Testability

Because of the NetBeans suggestions, we tend to change a for loop into a stream's forEach call; however, there is a subtle but very dangerous side effect on testability. If your code fails inside the forEach block, the IDE (actually the compiler) simply doesn't know on which line the error occurred and points to the start of the block. Debugging the code is also more difficult, since we don't control the computations and inner loops.

  • Performance

Again, the IDE always suggests changing an accumulation into a kind of map-reduce algorithm. The latter looks more complex, so I created a simple test to check how good that approach is. Surprisingly, it was much slower!

Here is the code:

import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Collection;

public class Java8Kata {

public static void main(String[] args) {

    System.out.println("Generating random numbers...");
    final Collection<Number> numbers = getRandomNumbers();

    System.out.println("Starting comparison...");

    for (int i = 0; i < 20; i++) {
        getTotalConventionalStyle(numbers);
        getTotalNewStyle(numbers);
    }
}

public static void getTotalConventionalStyle(Collection<Number> numbers) {

    long startTime = System.nanoTime();
    System.out.println("\n\nstarting conventional...");
    double total = 0;
    for (Number number : numbers) {
        total += number.doubleValue();
    }
    System.out.println("total = " + total);

    System.out.println("finish conventional:" + getPeriod(startTime) + " seconds");
}

public static void getTotalNewStyle(Collection<Number> numbers) {

    long startTime = System.nanoTime();
    System.out.println("\n\nstarting new style ...");

    double total = 0;
    //netbeans conversion
    total = numbers.parallelStream().map((number) -> number.doubleValue()).reduce(total, (accumulator, _item) -> accumulator + _item);
    System.out.println("total = " + total);

    System.out.println("finish new style:" + getPeriod(startTime) + " seconds");
}

public static Collection<Number> getRandomNumbers() {

    Collection<Number> numbers = new ArrayList<>();

    for (long i = 0; i < 9999999; i++) {
        double randomInt = 9999999.0 * Math.random();
        numbers.add(randomInt);
    }
    return numbers;
}

public static String getPeriod(long startTime) {
    long time = System.nanoTime() - startTime;
    final double seconds = ((double) time / 1000000000);
    return new DecimalFormat("#.##########").format(seconds);
}

}

I ran the comparison 20 times just to make sure the results were consistent.

Here they are:

Generating random numbers...
Starting comparison...


starting conventional...
total = 5.000187629072326E13
finish conventional:0.309586459 seconds


starting new style ...
total = 5.000187629073409E13
finish new style:20.862798586 seconds


starting conventional...
total = 5.000187629072326E13
finish conventional:0.316218488 seconds


starting new style ...
total = 5.000187629073409E13
finish new style:20.594838025 seconds

[...]

It wasn't my goal to do a deep performance test; I just wanted to see whether NetBeans was helping me or not.

In conclusion, I'd say you should use these new constructs carefully, based on your own judgment, instead of blindly following IDE suggestions.

Fábio asked Mar 17 '26 15:03


2 Answers

Despite the clickbaity title ("Are streams and lambdas deceiving?"), I believe there are some real issues here.

If you're saying "don't blindly accept refactorings suggested by IDEs" then sure that makes sense. It could be that there are issues with NetBeans' refactorings if the resulting code is worse in some respects than the original. Then again, the IDE doesn't know what the programmer is doing, and presuming the programmer does know what he or she is doing, a refactoring that temporarily makes things worse isn't necessarily a bug.

On the specific points mentioned, broken down a bit more specifically:

  • Legibility. Yes, lambdas and streams can make things worse. But they can also make things much, much better. One can write bad code using any language and library constructs.

  • Compile-time Errors. These errors, particularly those relating to type inference, can be confusing. Usually I break the expression down into temporaries if I have trouble composing a long pipeline.

  • Testability. Any huge chunk of code nested within some structure is difficult to test. This includes long multi-line lambdas, which I avoid for this reason and others. Extracting a method is quite helpful here. An emerging style seems to favor stream pipelines composed of very simple lambdas or method references.

  • Debuggability. This can be confusing, and is possibly hampered by early issues with the IDEs' debuggers vs new language features, but I don't see this as a long-term problem. I've been able to single-step through multi-line lambdas with NetBeans 8, for example. I expect other IDEs to work comparably.

  • Performance. It's always a requirement for programmers to know what they're doing, and developing a mental model of performance is a necessity. Lambdas, streams, and parallelism are new in Java 8 (only a few months old as of this writing), so building that model will take some time. Two quick points: 1) the cost of setting up a parallel pipeline is significant, and it must be amortized over the processing of the stream elements. 2) Dealing with primitives is a bit of a bother, but you have to pay attention lest autoboxing and auto-unboxing kill your performance. That's clearly going on here.

  • Benchmarking. Use a real harness such as JMH instead of rolling your own. Incidentally, Aleksey Shipilev (JMH author) spoke at the JVM Language Summit yesterday on benchmarking, and particularly on the pitfalls of using nanoTime to measure the elapsed time. You will be startled by what problems you can run into using nanoTime.
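To make the testability and compile-time error points above concrete, here is a minimal sketch. The class `PipelineStyle` and the helper `normalize` are hypothetical names invented for illustration; they are not from the question. The idea is that logic which might otherwise sit in a multi-line lambda becomes a named method that can be tested on its own, and a long pipeline can be split into typed temporaries when type inference errors get confusing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineStyle {

    // Hypothetical helper: logic that could have been a multi-line lambda.
    // As a named method it can be unit tested directly.
    static String normalize(String raw) {
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("blank entry");
        }
        return trimmed.toLowerCase(Locale.ROOT);
    }

    // Pipeline built only from a method reference and simple stages.
    static List<String> normalizeAll(List<String> input) {
        return input.stream()
                .map(PipelineStyle::normalize)
                .sorted()
                .collect(Collectors.toList());
    }

    // Same pipeline split into temporaries; each stage has an explicit type,
    // which narrows down where a compile-time error comes from.
    static List<String> normalizeAllStepwise(List<String> input) {
        Stream<String> cleaned = input.stream().map(PipelineStyle::normalize);
        Stream<String> ordered = cleaned.sorted();
        return ordered.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(" Beta", "alpha ", "GAMMA");
        System.out.println(normalizeAll(input));
        System.out.println(normalizeAllStepwise(input));
    }
}
```

Both variants should produce the same list; the stepwise one is mainly useful while composing the pipeline and can be collapsed again afterwards.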

Finally, I have to say, this is quite a lousy example. It certainly makes the performance of parallel streams and lambdas look bad, but dkatzel (+1) has taken a swing at that one. Overall the code has a large number of issues. Adding random values into a Collection<Number> and then extracting double values? This is more a measure of boxing/unboxing than real computation. Coming to sensible conclusions about code is difficult in the first place, but if the code in question is bad to start with, the conclusions have no credibility. While summing numbers is a suspect benchmark to begin with, a reasonable approach would be to begin with a large array of double primitives and compare the code and performance of both conventional and streams-based code. That, however, will have to wait for another time.
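As a rough starting point for that comparison, the two code shapes over a `double[]` might look like the sketch below. The class name `PrimitiveSum`, the array size, and the seed are assumptions chosen for illustration, and the sketch deliberately contains no timing code: any measurement should go through a harness such as JMH rather than hand-rolled nanoTime calls.

```java
import java.util.Arrays;
import java.util.Random;

class PrimitiveSum {

    // Conventional style: plain loop over primitives, no boxing anywhere.
    static double sumLoop(double[] values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Streams style: Arrays.stream(double[]) yields a DoubleStream,
    // so the elements stay primitive through the whole pipeline.
    static double sumStream(double[] values) {
        return Arrays.stream(values).sum();
    }

    public static void main(String[] args) {
        // Assumed test data: one million pseudo-random doubles with a fixed seed.
        double[] values = new Random(42).doubles(1_000_000).toArray();
        System.out.println("loop   = " + sumLoop(values));
        System.out.println("stream = " + sumStream(values));
    }
}
```

The two totals may differ in the last few digits, since the DoubleStream.sum() documentation allows the implementation to use compensated summation, whereas the loop adds naively.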

Stuart Marks answered Mar 19 '26 12:03


You aren't doing the new-style summation correctly.

You want this:

total = numbers.parallelStream()
                .mapToDouble(number -> number.doubleValue())
                .sum();

This gives you a DoubleStream (kind of like a Stream<double>) instead of a Stream<Double>, and then uses the new sum() reduction, which is a primitive summation rather than an Object summation, for much faster computation times.

This is also much easier to read.
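To see the two forms side by side, here is a small sketch. The class name `BoxedVersusPrimitive` and the sample values are made up for illustration, and the streams are sequential to keep the example simple. Both methods compute the same total; the difference is only in whether the intermediate values are boxed.

```java
import java.util.ArrayList;
import java.util.Collection;

class BoxedVersusPrimitive {

    // Boxed form: map(...) yields a Stream<Double>, so every step of
    // reduce unboxes two Doubles and boxes the result again.
    static double sumBoxed(Collection<Number> numbers) {
        return numbers.stream()
                .map(Number::doubleValue)
                .reduce(0.0, Double::sum);
    }

    // Primitive form: mapToDouble(...) yields a DoubleStream,
    // and sum() works on plain double values.
    static double sumPrimitive(Collection<Number> numbers) {
        return numbers.stream()
                .mapToDouble(Number::doubleValue)
                .sum();
    }

    public static void main(String[] args) {
        Collection<Number> numbers = new ArrayList<>();
        numbers.add(1);    // Integer
        numbers.add(2.5);  // Double
        numbers.add(3L);   // Long
        System.out.println("boxed     = " + sumBoxed(numbers));
        System.out.println("primitive = " + sumPrimitive(numbers));
    }
}
```

Swapping `stream()` for `parallelStream()` in either method keeps the code shape the same, which is all the change to the question's code amounts to.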

When I run it on my machine with this simple code change I get this:

Generating random numbers...
Starting comparison...

finish conventional:  0.078106 seconds
finish new style:     0.279964 seconds


finish conventional: 0.126721 seconds
finish new style:    0.045977 seconds

 .... etc

That is roughly 100x faster than your version and, on average, basically as fast as the conventional loop. Running the new streams API still has a performance cost: think about all the background work required to run a multithreaded iteration and summation.

dkatzel answered Mar 19 '26 14:03


