I was recently playing with some benchmarks and found very interesting results that I can't explain right now. Here is the benchmark:
@BenchmarkMode(Mode.Throughput)
@Fork(1)
@State(Scope.Thread)
@Warmup(iterations = 10, time = 1, batchSize = 1000)
@Measurement(iterations = 10, time = 1, batchSize = 1000)
public class ArrayCopy {
@Param({"1","5","10","100", "1000"})
private int size;
private int[] ar;
@Setup
public void setup() {
ar = new int[size];
for (int i = 0; i < size; i++) {
ar[i] = i;
}
}
@Benchmark
public int[] SystemArrayCopy() {
final int length = size;
int[] result = new int[length];
System.arraycopy(ar, 0, result, 0, length);
return result;
}
@Benchmark
public int[] javaArrayCopy() {
final int length = size;
int[] result = new int[length];
for (int i = 0; i < length; i++) {
result[i] = ar[i];
}
return result;
}
@Benchmark
public int[] arraysCopyOf() {
final int length = size;
return Arrays.copyOf(ar, length);
}
}
Result:
Benchmark (size) Mode Cnt Score Error Units
ArrayCopy.SystemArrayCopy 1 thrpt 10 52533.503 ± 2938.553 ops/s
ArrayCopy.SystemArrayCopy 5 thrpt 10 52518.875 ± 4973.229 ops/s
ArrayCopy.SystemArrayCopy 10 thrpt 10 53527.400 ± 4291.669 ops/s
ArrayCopy.SystemArrayCopy 100 thrpt 10 18948.334 ± 929.156 ops/s
ArrayCopy.SystemArrayCopy 1000 thrpt 10 2782.739 ± 184.484 ops/s
ArrayCopy.arraysCopyOf 1 thrpt 10 111665.763 ± 8928.007 ops/s
ArrayCopy.arraysCopyOf 5 thrpt 10 97358.978 ± 5457.597 ops/s
ArrayCopy.arraysCopyOf 10 thrpt 10 93523.975 ± 9282.989 ops/s
ArrayCopy.arraysCopyOf 100 thrpt 10 19716.960 ± 728.051 ops/s
ArrayCopy.arraysCopyOf 1000 thrpt 10 1897.061 ± 242.788 ops/s
ArrayCopy.javaArrayCopy 1 thrpt 10 58053.872 ± 4955.749 ops/s
ArrayCopy.javaArrayCopy 5 thrpt 10 49708.647 ± 3579.826 ops/s
ArrayCopy.javaArrayCopy 10 thrpt 10 48111.857 ± 4603.024 ops/s
ArrayCopy.javaArrayCopy 100 thrpt 10 18768.866 ± 445.238 ops/s
ArrayCopy.javaArrayCopy 1000 thrpt 10 2462.207 ± 126.549 ops/s
So there are two strange things here:
Arrays.copyOf
is 2 times faster than System.arraycopy
for small
arrays (1,5,10 size). However, on a large array of size 1000
Arrays.copyOf
becomes almost 2 times slower. I know that both
methods are intrinsics, so I would expect the same performance. Where
does this difference come from? System.arraycopy
. It's not clear to me why. Does anybody know?VM version: JDK 1.8.0_131, VM 25.131-b11
Your System.arraycopy
benchmark is not semantically equivalent to Arrays.copyOf
.
It will be if you replace
System.arraycopy(ar, 0, result, 0, length);
with
System.arraycopy(ar, 0, result, 0, Math.min(ar.length, length));
With this change, the performance of both benchmarks will also become similar.
Why is the first variant slower then?
Without knowing how length
relates to ar.length
JVM needs to perform additional bounds check and be prepared to throw IndexOutOfBoundsException
when length > ar.length
.
This also breaks the optimization to eliminate redundant zeroing. You know, every allocated array must be initialized with zeros. However, JIT can avoid zeroing if it sees that the array is filled right after creation. But -prof perfasm
clearly shows that the original System.arraycopy
benchmark spends a significant amount of time clearing the allocated array:
0,84% 0x000000000365d35f: shr $0x3,%rcx
0,06% 0x000000000365d363: add $0xfffffffffffffffe,%rcx
0,69% 0x000000000365d367: xor %rax,%rax
0x000000000365d36a: shl $0x3,%rcx
21,02% 0x000000000365d36e: rep rex.W stos %al,%es:(%rdi) ;*newarray
Manual copy appeared faster for small arrays, because unlike System.arraycopy
it does not perform any runtime calls to VM functions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With