Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java8 Lambda performance vs public functions

I've been demo testing a bit with lambda performance using Java8 VS. Java8 public functions.

The case is as the following:

  1. I have a list existing of 10 people (5 male and 5 female).

  2. I'd like to know which woman have an age between 18 and 25

Now, when I'm executing these steps a milion times, the results would be:

Lambda with ForEach took: 395 ms (396 ms using JUnit)

Public functions took: 173 ms (169 ms using JUnit)

Lambda with Collect took: 334 ms (335 ms using JUnit)

Now I didn't expect the execution time of lambda to be twice up to six times longer then regular functions.

So, now I'm pretty much wondering whether I've missed something here.

The source can be found here: pastebin.com/BJBk4Tu6


Follow up:

  1. When expanding the list to 1.000.000 items
  2. and filtering all young adult woman once

The results would be:

Lambda with ForEach took: 59 ms

Public functions took: 15 ms

Lambda with Collect took: 12 ms

However, when I'm trying to filter the same list existing of 1.000.000 people 100 times, the results would be:

Lambda with ForEach took: 227 ms

Public functions took: 134 ms

Lambda with Collect took: 172 ms

So, as a final conclusion: Lambdas are quicker when it comes to filtering larger lists while public function (the old way) are quicker at filtering smaller lists.

Also, public functions are quicker when it comes to filtering any lists multiple times, for whatever purpose you'd require to do that.

Latest code: pastebin.com/LcVhgnYv

like image 246
larssy1 Avatar asked Oct 08 '14 08:10

larssy1


1 Answers

As pointed out in the comments: You can hardly draw any conclusion from such a single, simple and isolated microbenchmark run.

Partially quoting from another (otherwise unrelated) answer:

In order to properly and reliably measure execution times, there exist several options. Apart from a profiler, like VisualVM, there are frameworks like JMH or Caliper, but admittedly, using them may be some effort.

For the simplest form of a very basic, manual Java Microbenchmark you have to consider the following:

  • Run the algorithms multiple times, to give the JIT a chance to kick in
  • Run the algorithms alternatingly and not only one after the other
  • Run the algorithms with increasing input size
  • Somehow save and print the results of the computation, to prevent the computation from being optimized away
  • Consider that timings may be distorted by the garbage collector (GC)

These are only rules of thumb, and there may still be unexpected results (refer to the links above for more details). But with this strategy, you usually obtain a good indication about the performance, and at least can see whether it's likely that there really are significant differences between the algorithms.

Related reading:

  • How do I write a correct micro-benchmark in Java?
  • Java theory and practice: Anatomy of a flawed microbenchmark
  • HotSpot Internals

I applied these basic steps to your program. Here is an MCVE :

NOTE: The remaining part was updated in response to the follow-up edit of the question)

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

class Person {
    public static final int MALE = 0;
    public static final int FEMALE = 1;
    private final String name;
    private final int sex;
    private final int age;

    public Person(String name, int sex, int age) {
        this.name = name;
        this.sex = sex;
        this.age = age;
    }

    public int getSex() {
        return sex;
    }

    public int getAge() {
        return age;
    }
}

public class Main {

    public static void main(String[] args) {
        new Main();
    }

    private List<Person> people;

    public Main() {

        for (int size=10; size<=1000000; size*=10) {

            Random r = new Random(0);
            people = new ArrayList<Person>();
            for (int i = 0; i < size; i++) {
                int s = r.nextInt(2);
                int a = 25 + r.nextInt(20);
                people.add(new Person("p" + i, s, a));
            }

            int min = 10000000 / size;
            int max = 10 * min;
            for (int n = min; n <= max; n += min) {
                lambdaMethodUsingForEach(n);
                lambdaMethodUsingCollect(n);
                defaultMethod(n);
            }
        }
    }

    public void lambdaMethodUsingForEach(int n) {
        List<Person> lambdaOutput = new ArrayList<Person>();
        long lambdaStart = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
            lambdaOutput.addAll(getFemaleYoungAdultsUsingLambdaUsingForEach());
        }
        System.out.printf("List size: %10d, runs: %10d, result: %10d, ForEach took: " +
            (System.currentTimeMillis() - lambdaStart) + " ms\n",
            people.size(), n, lambdaOutput.size());
    }

    public void lambdaMethodUsingCollect(int n) {
        List<Person> lambdaOutput = new ArrayList<Person>();
        long lambdaStart = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
            lambdaOutput.addAll(getFemaleYoungAdultsUsingLambdaUsingCollect());
        }
        System.out.printf("List size: %10d, runs: %10d, result: %10d, collect took: " +
            (System.currentTimeMillis() - lambdaStart) + " ms\n",
            people.size(), n, lambdaOutput.size());
    }

    public void defaultMethod(int n) {
        List<Person> defaultOutput = new ArrayList<Person>();
        long defaultStart = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
            defaultOutput.addAll(getFemaleYoungAdultsUsingFunctions());
        }
        System.out.printf("List size: %10d, runs: %10d, result: %10d, default took: " +
            (System.currentTimeMillis() - defaultStart) + " ms\n",
            people.size(), n, defaultOutput.size());
    }

    public List<Person> getFemaleYoungAdultsUsingLambdaUsingForEach() {
        List<Person> people = new ArrayList<Person>();
        this.people.stream().filter(
                (p) -> p.getSex() == Person.FEMALE &&
                p.getAge() >= 18 &&
                p.getAge() <= 25).forEach(people::add);
        return people;
    }

    public List<Person> getFemaleYoungAdultsUsingLambdaUsingCollect() {
        return this.people.stream().filter(
                (p) -> p.getSex() == Person.FEMALE &&
                p.getAge() >= 18 &&
                p.getAge() <= 25).collect(Collectors.toList());
    }

    public List<Person> getFemaleYoungAdultsUsingFunctions() {
        List<Person> people = new ArrayList<Person>();
        for (Person p : this.people) {
            if (p.getSex() == Person.FEMALE && p.getAge() >= 18 && p.getAge() <= 25) {
                people.add(p);
            }
        }
        return people;
    }
}

The output on My Machine® is along the lines of this:

    ...
List size:       10, runs:   10000000, result:   10000000, ForEach took: 1482 ms
List size:       10, runs:   10000000, result:   10000000, collect took: 2014 ms
List size:       10, runs:   10000000, result:   10000000, default took: 1013 ms
...
List size:      100, runs:    1000000, result:    3000000, ForEach took: 664 ms
List size:      100, runs:    1000000, result:    3000000, collect took: 515 ms
List size:      100, runs:    1000000, result:    3000000, default took: 441 ms
...
List size:     1000, runs:     100000, result:    2300000, ForEach took: 778 ms
List size:     1000, runs:     100000, result:    2300000, collect took: 721 ms
List size:     1000, runs:     100000, result:    2300000, default took: 841 ms
...
List size:    10000, runs:      10000, result:    2450000, ForEach took: 970 ms
List size:    10000, runs:      10000, result:    2450000, collect took: 971 ms
List size:    10000, runs:      10000, result:    2450000, default took: 1119 ms
...
List size:   100000, runs:       1000, result:    2536000, ForEach took: 976 ms
List size:   100000, runs:       1000, result:    2536000, collect took: 1057 ms
List size:   100000, runs:       1000, result:    2536000, default took: 1109 ms
...
List size:  1000000, runs:        100, result:    2488600, ForEach took: 1323 ms
List size:  1000000, runs:        100, result:    2488600, collect took: 1305 ms
List size:  1000000, runs:        100, result:    2488600, default took: 1422 ms

You can see that the difference between the ForEach and the default (public methods) approach is vanishing even for smaller lists. For larger lists, there even seems to be a slight advantage for the lambda-based approaches.

To emphasize this again: This is a very simple microbenchmark, and even this does not necessarily tell much about the performance of these approaches in practice. However, it is at least reasonable to assume that the difference between the ForEach and the public methods is not as large as suggested by your initial test. Nevertleless: +1 from me for anybody who runs this in JMH or Caliper and posts some further insights about this.

like image 88
Marco13 Avatar answered Oct 11 '22 16:10

Marco13