I tried to optimize the RAM usage of an Android game by changing int primitives to shorts. Before doing this, I wanted to know how the primitive types in Java perform.
So I created this little benchmark using the Caliper library.
import java.util.Random;
// Caliper's Benchmark and @Param must also be imported; their package depends on the Caliper version in use.

public class BenchmarkTypes extends Benchmark {
    @Param("10") private long testLong;
    @Param("10") private int testInt;
    @Param("10") private short testShort;
    @Param("5000") private long resultLong = 5000;
    @Param("5000") private int resultInt = 5000;
    @Param("5000") private short resultShort = 5000;

    // Use the same random operand for all three types so the runs are comparable.
    @Override
    protected void setUp() throws Exception {
        Random rand = new Random();
        testShort = (short) rand.nextInt(1000);
        testInt = (int) testShort;
        testLong = (long) testShort;
    }

    public long timeLong(int reps) {
        for (int i = 0; i < reps; i++) {
            resultLong += testLong;
            resultLong -= testLong;
        }
        return resultLong;
    }

    public int timeInt(int reps) {
        for (int i = 0; i < reps; i++) {
            resultInt += testInt;
            resultInt -= testInt;
        }
        return resultInt;
    }

    public short timeShort(int reps) {
        for (int i = 0; i < reps; i++) {
            resultShort += testShort;
            resultShort -= testShort;
        }
        return resultShort;
    }
}
The results of the test surprised me.
Test circumstances
Benchmark run under the Caliper library.
Test results
https://microbenchmarks.appspot.com/runs/0c9bd212-feeb-4f8f-896c-e027b85dfe3b
Int 2.365 ns
Long 2.436 ns
Short 8.156 ns
Test conclusion?
The short primitive type is significantly slower (roughly 3 to 4 times) than the long and int primitive types?
Question
Why is the short primitive significantly slower than int or long? I would expect the int primitive type to be the fastest on a 32-bit VM, and long and short to take equal time, or short to be even faster.
Is this also the case on Android phones? Android phones have generally run in a 32-bit environment, but nowadays more and more of them ship with 64-bit processors.
Java bytecode does not support basic operations (+, -, *, /, >>, >>>, <<, %) on primitive types smaller than int. There are simply no bytecodes allocated for such operations in the instruction set. Thus the VM needs to convert the short(s) to int(s), perform the operation, then truncate the int back to short and store that in the result.
Check out the generated byte code with javap to see the difference between your short and int tests.
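To keep the javap output small, the two loop bodies can be isolated in a plain class. This is only a sketch: the class name FieldArithmetic and its method names are made up for illustration, and only the field operations mirror the benchmark in the question.

```java
// Hypothetical stand-alone class (not part of the original benchmark).
// Compile and disassemble it with:
//   javac FieldArithmetic.java
//   javap -c FieldArithmetic
// In stepShort, look for an i2s instruction ahead of each putfield;
// stepInt should contain none.
class FieldArithmetic {
    private int resultInt = 5000;
    private short resultShort = 5000;

    // Same add/subtract pair as timeInt in the question.
    int stepInt(int testInt) {
        resultInt += testInt;
        resultInt -= testInt;
        return resultInt;
    }

    // Same add/subtract pair as timeShort in the question.
    short stepShort(short testShort) {
        resultShort += testShort;
        resultShort -= testShort;
        return resultShort;
    }

    public static void main(String[] args) {
        FieldArithmetic f = new FieldArithmetic();
        System.out.println(f.stepInt(10));            // prints 5000
        System.out.println(f.stepShort((short) 10));  // prints 5000
    }
}
```

Both methods compute the same value; the point of the exercise is the difference in the disassembly, not in the result.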
The VM/JIT optimizations are apparently heavily biased towards int/long operations, which makes sense since they are the most common.
Types smaller than int have their uses, but primarily for saving memory in arrays. They are not as well suited as simple class members (of course you still use them when it's the appropriate type for the data). Smaller members may not even reduce an object's size. Current VMs are (again) mainly tailored for execution speed, so the VM may even align fields to native machine word boundaries to increase access performance at the expense of memory.
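One way to read that advice, as a rough sketch: keep bulk data in a short[] for compact storage, but do the arithmetic in an int local so that no narrowing is needed inside the loop. The class CompactStorage and its sum method are hypothetical names, and the byte counts below cover the element payload only (array headers and alignment are ignored).

```java
// Hypothetical example: compact storage in a short[], arithmetic in int.
class CompactStorage {
    // Sums the samples with an int accumulator, so the loop body
    // never has to convert a result back to short.
    static int sum(short[] samples) {
        int total = 0;
        for (short s : samples) {
            total += s; // s is widened to int; total stays an int
        }
        return total;
    }

    public static void main(String[] args) {
        short[] samples = {1000, 2000, 3000};
        // Element payload in bytes: 2 per short versus 4 per int.
        System.out.println(samples.length * (Short.SIZE / 8));   // prints 6
        System.out.println(samples.length * (Integer.SIZE / 8)); // prints 12
        System.out.println(sum(samples));                        // prints 6000
    }
}
```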
This is likely due to the way Java/Android handles integer arithmetic on primitives that are smaller than an int.
When two primitives of a data type smaller than int are added in Java, they are automatically promoted to int. A cast is normally required to convert the result back into the original data type.
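The promotion can be observed without looking at bytecode. In this small sketch (the class PromotionDemo and the typeOf overloads are made up for illustration), overload resolution picks the method that matches the static type of the expression:

```java
// Hypothetical demo: overload resolution reveals the static type of a + b.
class PromotionDemo {
    static String typeOf(short x) { return "short"; }
    static String typeOf(int x)   { return "int"; }
    static String typeOf(long x)  { return "long"; }

    public static void main(String[] args) {
        short a = 10, b = 20;
        System.out.println(typeOf(a));               // prints short
        System.out.println(typeOf(a + b));           // prints int
        System.out.println(typeOf((short) (a + b))); // prints short
        // short c = a + b;  // would not compile without an explicit (short) cast
    }
}
```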
The trick comes with shorthand operations like +=, -= and so on, where the cast happens implicitly, such that the final result of the operation:
resultShort += testShort;
actually resembles something like this:
resultShort = (short)((int) resultShort + (int) testShort);
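A minimal sketch of that equivalence, with hypothetical method names addCompound and addExplicit: both forms give the same result, including the wrap-around that occurs when the int sum no longer fits into a short.

```java
// Hypothetical demo: compound assignment on a short carries an implicit cast.
class CompoundAssignmentDemo {
    static short addCompound(short result, short delta) {
        result += delta; // implicit narrowing back to short
        return result;
    }

    static short addExplicit(short result, short delta) {
        result = (short) (result + delta); // the same operation spelled out
        return result;
    }

    public static void main(String[] args) {
        System.out.println(addCompound((short) 5000, (short) 10));   // prints 5010
        System.out.println(addCompound(Short.MAX_VALUE, (short) 1)); // prints -32768
        System.out.println(addExplicit(Short.MAX_VALUE, (short) 1)); // prints -32768
    }
}
```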
If we look at the disassembled bytecode of a method:
public static int test(int a, int b){
a += b;
return a;
}
we see:
public static int test(int, int);
Code:
0: iload_0
1: iload_1
2: iadd
3: istore_0
4: iload_0
5: ireturn
Comparing this to the identical method with the data type replaced by short, we get:
public static short test(short, short);
Code:
0: iload_0
1: iload_1
2: iadd
3: i2s
4: istore_0
5: iload_0
6: ireturn
Notice the additional instruction i2s (integer to short). This is the likely culprit for the loss of performance. Another thing you can notice is that all instructions are integer-based, as denoted by the prefix i (e.g. iadd, meaning integer add). This means that somewhere during the iload phase the shorts were promoted to integers, which is likely to degrade performance as well.
If you can take my word for it, the bytecode for long arithmetic is identical to the integer one, with the exception that the instructions are long-specific (e.g. ladd instead of iadd).