Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use JMH properly? Example with ArrayList

In my examples theoretically performance of 2 methods should be pretty similar. In the first case I use array, at the second - ArrayList with ensured capacity.

The results is the following:

LessonBenchmark2.capacityTestArray avgt 5 1,354 ± 0,057 ms/op

LessonBenchmark2.capacityTestArrayListEnsured avgt 5 32,018 ± 81,911 ms/op

Here it seems that array is much faster (1.354 vs 32.018 ms/op). It might be that the settings of my benchmark with JMH is not correct. How to make it right?

Also if I use @Setup(Level.Invocation), then the results are close (1,405 vs 1,496 ms/op):

LessonBenchmark.capacityTestArray avgt 5 1,405 ± 0,143 ms/op

LessonBenchmark.capacityTestArrayListEnsured avgt 5 1,496 ± 0,104 ms/op

However it is said to use Invocation with care. Also Iteration mode seems logically right.

Here is the code:

public static void main(String[] args) throws Exception {
    org.openjdk.jmh.Main.main(args);
}

static final int iter = 5;
static final int fork = 1;
static final int warmIter = 5;

@State(Scope.Benchmark)
public static class Params {
    public int length = 100_000;
    public Person[] people;
    public ArrayList<Person> peopleArrayListEnsure;

    // before each iteration of the benchmark
    @Setup(Level.Iteration)
    public void setup() {
        people = new Person[length];
        peopleArrayListEnsure = new ArrayList<>(length);
    }
}

@Benchmark
@Warmup(iterations = warmIter)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = fork)
@Measurement(iterations = iter)
public void capacityTestArray(Params p) {
    for (int i = 0; i < p.length; i++) {
        p.people[i] = new Person(i, new Address(i, i), new Pet(i, i));
    }
}

@Benchmark
@Warmup(iterations = warmIter)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = fork)
@Measurement(iterations = iter)
public void capacityTestArrayListEnsured(Params p) {
    for (int i = 0; i < p.length; i++) {
        p.peopleArrayListEnsure.add(new Person(i, new Address(i, i), new Pet(i, i)));
    }
}

public static class Person {
    private int id;
    private Address address;
    private Pet pet;

    public Person(int id, Address address, Pet pet) {
        this.id = id;
        this.address = address;
        this.pet = pet;
    }
}

public static class Address {
    private int countryId;
    private int cityId;

    public Address(int countryId, int cityId) {
        this.countryId = countryId;
        this.cityId = cityId;
    }
}

public static class Pet {
    private int age;
    private int typeId;

    public Pet(int age, int typeId) {
        this.age = age;
        this.typeId = typeId;
    }
}
like image 657
Kirill Ch Avatar asked Feb 17 '21 13:02

Kirill Ch


People also ask

What is the use of JMH?

JMH is a toolkit that helps you implement Java microbenchmarks correctly. JMH is developed by the same people who implement the Java virtual machine, so these guys know what they are doing. This JMH tutorial will teach you how to implement and run Java microbenchmarks with JMH.

What are the different values of the JMH method parameter?

There are three different values this parameter can take. The value you set instruct JMH about when the method should be called. The possible values are: The method is called once for each time for each full run of the benchmark. A full run means a full "fork" including all warmup and benchmark iterations.

How do I get Started with JMH?

The easiest way to get started with JMH is to generate a new JMH project using the JMH Maven archetype. The JMH Maven archetype will generate a new Java project with a single, example benchmark Java class, and a Maven pom.xml file.

What is the use of an array list in Java?

ArrayList is implemented on behavior to add the elements or values at the end of the list. For example, if you are working with huge volume data and after adding the values to ArrayList if you do not want to remove the values or add new values in the between the exiting values then you are good to use the ArrayList in such scenario's.


3 Answers

The test is badly designed; in your test, because the arraylist is created only once for multiple invocations, the array-based code just overwrites the same array a bunch of times, whereas the arraylist version adds more and more, and needs to grow.

One trivial fix is to clear it first. Another fix is to stop using state here and just make the creation of the object (be it the 100k person array, or the person arraylist, presized for 100k persons) part of the test harness. Once you take care of this, the results are the exact same taking into account the error, there is no performance different at all between arrays and arraylists for this.

MyBenchmark.capacityTestArray             avgt    5  1,325 ± 0,059  ms/op
MyBenchmark.capacityTestArrayListEnsured  avgt    5  1,287 ± 0,157  ms/op

I simplified by removing the Params state entirely, and making the creation of the list and array part of each test's outlay:

    static final int LEN = 100_000;
    
    public void capacityTestArray() {
        Person[] people = new Person[LEN];
        for (int i = 0; i < LEN; i++) {
            people[i] = new Person(i, new Address(i, i), new Pet(i, i));
        }
    }

    public void capacityTestArrayListEnsured() {
        List<Person> p = new ArrayList<Person>(LEN);
        for (int i = 0; i < LEN; i++) {
            p.add(new Person(i, new Address(i, i), new Pet(i, i)));
        }
    }

(keeping all annotations and the Person, Address, etc classes the same).

Alternatively, take your existing code and just toss a list.clear() at the top.

like image 78
rzwitserloot Avatar answered Oct 24 '22 10:10

rzwitserloot


As soon as you understand the difference between Trial, Iteration and Invocation, your question becomes very easy to answer. And what place to better understand these then the samples themselves.

Invocation is the a single execution of the method. Let's say there are 3 threads and each execute this benchmark method 100 times. This means Invocation == 300. That is why you get very similar results using this as the set-up.

Iteration would be 3 from the example above.

Trial would be 1, when all the threads execute all their methods.

Invocation, though has a scary documentation has its usage, like a sorted data structure; but I've used in various other places too. Also the notion of operation can be "altered" with @OperationsPerInvocation - which is another sharp tool.


Armed with this - it gets easy to answer. When you use Iteration, your ArrayList will grow constantly - which internally means System::arrayCopy, while your array does not.

Once you figure this out, you need to read the samples and see that your second problem is that your @Benchmark methods return void. And, contrary, to the other answer - I would not suggest to bulk everything with the test method itself, but this raises the question on what do you want to test, to begin with. Do not forget that these are just numbers, in the end, you need to reason about what they mean and how to properly set-up a JMH test.

like image 44
Eugene Avatar answered Oct 24 '22 10:10

Eugene


Even if initially thought it was a natural performance difference, below's comment were right


As commented below, the difference is indeed higher than expected.

The only scenario in which the add() goes from O(1) to O(n) is if it grows. May it be that the tests are reusing the same arraylist (as result of setup not being called more than once)? This would only affect to the arraylist test, as the array would just override the values.

Just to be sure the arraylist isn't growing:

public void capacityTestArrayListEnsured(Params p) 
{
    p.peopleArrayListEnsure = new ArrayList<>(p.length); //or clear()?
    for (int i = 0; i < p.length; i++) 
        p.peopleArrayListEnsure.add(new Person(i, new Address(i, i), new Pet(i, i)));
}

In order to make it fair, you could also initialize the array in the other method so the elapsed times are equally added:

public void capacityTestArray(Params p)  
{
    p.people = new Person[p.length];
    for (int i = 0; i < p.length; i++) 
        p.people[i] = new Person(i, new Address(i, i), new Pet(i, i));
}
like image 1
aran Avatar answered Oct 24 '22 11:10

aran