In my examples theoretically performance of 2 methods should be pretty similar. In the first case I use array, at the second - ArrayList with ensured capacity.
The results is the following:
LessonBenchmark2.capacityTestArray avgt 5 1,354 ± 0,057 ms/op
LessonBenchmark2.capacityTestArrayListEnsured avgt 5 32,018 ± 81,911 ms/op
Here it seems that array is much faster (1.354 vs 32.018 ms/op). It might be that the settings of my benchmark with JMH is not correct. How to make it right?
Also if I use @Setup(Level.Invocation), then the results are close (1,405 vs 1,496 ms/op):
LessonBenchmark.capacityTestArray avgt 5 1,405 ± 0,143 ms/op
LessonBenchmark.capacityTestArrayListEnsured avgt 5 1,496 ± 0,104 ms/op
However it is said to use Invocation with care. Also Iteration mode seems logically right.
Here is the code:
public static void main(String[] args) throws Exception {
org.openjdk.jmh.Main.main(args);
}
static final int iter = 5;
static final int fork = 1;
static final int warmIter = 5;
@State(Scope.Benchmark)
public static class Params {
public int length = 100_000;
public Person[] people;
public ArrayList<Person> peopleArrayListEnsure;
// before each iteration of the benchmark
@Setup(Level.Iteration)
public void setup() {
people = new Person[length];
peopleArrayListEnsure = new ArrayList<>(length);
}
}
@Benchmark
@Warmup(iterations = warmIter)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = fork)
@Measurement(iterations = iter)
public void capacityTestArray(Params p) {
for (int i = 0; i < p.length; i++) {
p.people[i] = new Person(i, new Address(i, i), new Pet(i, i));
}
}
@Benchmark
@Warmup(iterations = warmIter)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = fork)
@Measurement(iterations = iter)
public void capacityTestArrayListEnsured(Params p) {
for (int i = 0; i < p.length; i++) {
p.peopleArrayListEnsure.add(new Person(i, new Address(i, i), new Pet(i, i)));
}
}
public static class Person {
private int id;
private Address address;
private Pet pet;
public Person(int id, Address address, Pet pet) {
this.id = id;
this.address = address;
this.pet = pet;
}
}
public static class Address {
private int countryId;
private int cityId;
public Address(int countryId, int cityId) {
this.countryId = countryId;
this.cityId = cityId;
}
}
public static class Pet {
private int age;
private int typeId;
public Pet(int age, int typeId) {
this.age = age;
this.typeId = typeId;
}
}
JMH is a toolkit that helps you implement Java microbenchmarks correctly. JMH is developed by the same people who implement the Java virtual machine, so these guys know what they are doing. This JMH tutorial will teach you how to implement and run Java microbenchmarks with JMH.
There are three different values this parameter can take. The value you set instruct JMH about when the method should be called. The possible values are: The method is called once for each time for each full run of the benchmark. A full run means a full "fork" including all warmup and benchmark iterations.
The easiest way to get started with JMH is to generate a new JMH project using the JMH Maven archetype. The JMH Maven archetype will generate a new Java project with a single, example benchmark Java class, and a Maven pom.xml file.
ArrayList is implemented on behavior to add the elements or values at the end of the list. For example, if you are working with huge volume data and after adding the values to ArrayList if you do not want to remove the values or add new values in the between the exiting values then you are good to use the ArrayList in such scenario's.
The test is badly designed; in your test, because the arraylist is created only once for multiple invocations, the array-based code just overwrites the same array a bunch of times, whereas the arraylist version adds more and more, and needs to grow.
One trivial fix is to clear it first. Another fix is to stop using state here and just make the creation of the object (be it the 100k person array, or the person arraylist, presized for 100k persons) part of the test harness. Once you take care of this, the results are the exact same taking into account the error, there is no performance different at all between arrays and arraylists for this.
MyBenchmark.capacityTestArray avgt 5 1,325 ± 0,059 ms/op
MyBenchmark.capacityTestArrayListEnsured avgt 5 1,287 ± 0,157 ms/op
I simplified by removing the Params
state entirely, and making the creation of the list and array part of each test's outlay:
static final int LEN = 100_000;
public void capacityTestArray() {
Person[] people = new Person[LEN];
for (int i = 0; i < LEN; i++) {
people[i] = new Person(i, new Address(i, i), new Pet(i, i));
}
}
public void capacityTestArrayListEnsured() {
List<Person> p = new ArrayList<Person>(LEN);
for (int i = 0; i < LEN; i++) {
p.add(new Person(i, new Address(i, i), new Pet(i, i)));
}
}
(keeping all annotations and the Person
, Address
, etc classes the same).
Alternatively, take your existing code and just toss a list.clear()
at the top.
As soon as you understand the difference between Trial
, Iteration
and Invocation
, your question becomes very easy to answer. And what place to better understand these then the samples themselves.
Invocation
is the a single execution of the method. Let's say there are 3 threads and each execute this benchmark method 100 times. This means Invocation == 300
. That is why you get very similar results using this as the set-up.
Iteration
would be 3
from the example above.
Trial
would be 1
, when all the threads execute all their methods.
Invocation
, though has a scary documentation has its usage, like a sorted data structure; but I've used in various other places too. Also the notion of operation
can be "altered" with @OperationsPerInvocation
- which is another sharp tool.
Armed with this - it gets easy to answer. When you use Iteration
, your ArrayList
will grow constantly - which internally means System::arrayCopy
, while your array does not.
Once you figure this out, you need to read the samples and see that your second problem is that your @Benchmark
methods return void
. And, contrary, to the other answer - I would not suggest to bulk everything with the test method itself, but this raises the question on what do you want to test, to begin with. Do not forget that these are just numbers, in the end, you need to reason about what they mean and how to properly set-up a JMH
test.
Even if initially thought it was a natural performance difference, below's comment were right
As commented below, the difference is indeed higher than expected.
The only scenario in which the add()
goes from O(1)
to O(n)
is if it grows. May it be that the tests are reusing the same arraylist (as result of setup not being called more than once)? This would only affect to the arraylist test, as the array would just override the values.
Just to be sure the arraylist isn't growing:
public void capacityTestArrayListEnsured(Params p)
{
p.peopleArrayListEnsure = new ArrayList<>(p.length); //or clear()?
for (int i = 0; i < p.length; i++)
p.peopleArrayListEnsure.add(new Person(i, new Address(i, i), new Pet(i, i)));
}
In order to make it fair, you could also initialize the array in the other method so the elapsed times are equally added:
public void capacityTestArray(Params p)
{
p.people = new Person[p.length];
for (int i = 0; i < p.length; i++)
p.people[i] = new Person(i, new Address(i, i), new Pet(i, i));
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With