I am currently working on a tool that produces Java byte code directly. When considering performance, I could for example translate the following Java equivalent
int[] val = new int[1];
val[0] = 1 + 1;
in two different ways. Without applying optimizations, I could translate the stated code into the following byte code equivalent
ICONST_1
NEWARRAY T_INT
DUP
ICONST_0
ICONST_1
ICONST_1
IADD
IASTORE
where the array would in the end be the only value on the operand stack. This translation requires an operand stack size of 5, since at most five values (int[], int[], index, int, int) are on the stack at once, but it does not consume any local variable slots.
I could however also translate the code snippet like this:
ICONST_1
ICONST_1
IADD
ISTORE_1
ICONST_1
NEWARRAY T_INT
DUP
ICONST_0
ILOAD_1
IASTORE
This translation would reduce the size requirement for the operand stack by one slot but instead it would cost a local variable slot for storing the intermediate result.
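The stack-depth arithmetic for the two translations can be checked with a toy simulator. This is only a sketch: the class name StackDepth and its effect table are made up for illustration, it assumes the int[] is created with NEWARRAY and stored with an index operand and IASTORE (as a valid array store requires), and it treats every value as one slot, so long and double are not modeled.

```java
import java.util.Map;

public class StackDepth {
    // {pops, pushes} for the few opcodes used in the question (hypothetical table, not a full JVM model).
    private static final Map<String, int[]> EFFECT = Map.of(
        "ICONST_0", new int[] {0, 1},
        "ICONST_1", new int[] {0, 1},
        "ILOAD_1",  new int[] {0, 1},
        "ISTORE_1", new int[] {1, 0},
        "IADD",     new int[] {2, 1},
        "NEWARRAY", new int[] {1, 1},  // pops the length, pushes the array reference
        "DUP",      new int[] {1, 2},
        "IASTORE",  new int[] {3, 0}   // pops array reference, index and value
    );

    // Walks a straight-line opcode sequence and returns the deepest stack it reaches.
    static int maxStack(String... ops) {
        int depth = 0;
        int max = 0;
        for (String op : ops) {
            int[] effect = EFFECT.get(op);
            if (effect == null) {
                throw new IllegalArgumentException("unknown opcode: " + op);
            }
            if (depth < effect[0]) {
                throw new IllegalStateException("stack underflow at " + op);
            }
            depth = depth - effect[0] + effect[1];
            max = Math.max(max, depth);
        }
        return max;
    }

    public static void main(String[] args) {
        int direct = maxStack("ICONST_1", "NEWARRAY", "DUP", "ICONST_0",
                              "ICONST_1", "ICONST_1", "IADD", "IASTORE");
        int viaLocal = maxStack("ICONST_1", "ICONST_1", "IADD", "ISTORE_1",
                                "ICONST_1", "NEWARRAY", "DUP", "ICONST_0",
                                "ILOAD_1", "IASTORE");
        System.out.println(direct + " vs " + viaLocal); // prints "5 vs 4"
    }
}
```

Under those assumptions the first translation peaks at five slots and the second at four, so spilling the sum to a local variable saves exactly one operand stack slot.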
Setting aside the reuse of stack slots and local variables within a method: how should I weigh the cost of operand stack depth against the cost of local variable slots? Is operand stack space cheaper because it is not random-access memory?
Thank you for your help!
The question has no general answer.
In interpreter mode, local variables and stack depth are probably interchangeable in terms of performance, but that is of course up to the interpreter's implementation.
In JIT mode, it depends largely on the target architecture. If the target CPU uses a register-file programming model (say x86/x64 or PPC), there will probably be no operand stack at all in the resulting machine code: it will have been mapped onto registers, competing with local variables for the same register set. On an architecture built around register windows or a hardware stack (SPARC, for example), the operand stack should be very fast anyway.
You will only get a definitive answer if you look at the JIT-compiled code for a particular byte code sequence, and that code could change with each VM version. It is probably a waste of time to worry about optimizing your byte code this way.
Make your byte code come out using the same idioms javac uses. That way, there is a chance that the JIT will recognize the idiom and optimize it with a special code path handcrafted for that javac idiom.
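One way to find out which idiom javac uses is to compile the Java equivalent and disassemble it with javap -c. The class below is only a sketch for that purpose; the names JavacIdiom and make are made up for illustration.

```java
public class JavacIdiom {
    // The snippet from the question, wrapped in a method so that javap can show its byte code.
    static int[] make() {
        int[] val = new int[1];
        val[0] = 1 + 1; // 1 + 1 is a compile-time constant expression
        return val;
    }

    public static void main(String[] args) {
        System.out.println(make()[0]); // prints 2
    }
}
```

After javac JavacIdiom.java, running javap -c JavacIdiom lists the instructions javac chose for make(), which can then be compared with what the generator produces. Note that javac stores the array in a local variable here because the source declares one.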
The short answer is that you don't.
The HotSpot JVM uses a just-in-time compiler, so if your code is simple and executed a lot, it will get optimized and compiled at runtime. Therefore, trivial details like this are unlikely to matter, and if they do matter, it may not be in the way you expect.
Your best bet is to figure out if you actually have a performance problem and if so to try some profiling.
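To see whether HotSpot actually compiles a given method, a small driver that calls it often enough can be run with -XX:+PrintCompilation. The sketch below assumes the generated code is equivalent to the make() method shown; the class name HotLoop and the iteration count are arbitrary choices for illustration, and this is not a substitute for a real profiler or benchmark harness.

```java
public class HotLoop {
    // Stand-in for the generated method under test.
    static int[] make() {
        int[] val = new int[1];
        val[0] = 1 + 1;
        return val;
    }

    // Calls make() repeatedly and consumes the result so the work is not trivially dead.
    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += make()[0];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000)); // prints 2000000
    }
}
```

Running java -XX:+PrintCompilation HotLoop logs methods as they are compiled. Dumping the generated machine code additionally needs -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly and the hsdis disassembler plugin.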
P.S. Longs and doubles do take two slots on the operand stack as well.