
Why does the JVM have both `invokespecial` and `invokestatic` opcodes?

Tags: jvm, bytecode

Both instructions use static rather than dynamic dispatch. It seems like the only substantial difference is that invokespecial will always have, as its first argument, an object that is an instance of the class that the dispatched method belongs to. However, invokespecial does not actually put the object there; the compiler is the one responsible for making that happen by emitting the appropriate sequence of stack operations before emitting invokespecial. So replacing invokespecial with invokestatic should not affect the way the runtime stack / heap gets manipulated -- though I expect that it will cause a VerifyError for violating the spec.

I'm curious about the possible reasons behind making two distinct instructions that do essentially the same thing. I took a look at the source of the OpenJDK interpreter, and it seems like invokespecial and invokestatic are handled almost identically. Does having two separate instructions help the JIT compiler better optimize code, or does it help the classfile verifier prove some safety properties more efficiently? Or is this just a quirk in the JVM's design?
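For concreteness, here is a minimal sketch of the source constructs that typically end up as each opcode. The class and method names (`OpcodeBase`, `OpcodeDemo`, `square`, `parentName`) are invented for this illustration, and the opcode comments describe what javac is generally expected to emit; running `javap -c` on the compiled classes is the way to confirm it for a particular compiler version.

```java
class OpcodeBase {
    String name() { return "base"; }
}

final class OpcodeDemo extends OpcodeBase {
    // A static method: call sites use invokestatic and push no receiver.
    static int square(int x) { return x * x; }

    // The super() call compiles to invokespecial OpcodeBase.<init>.
    OpcodeDemo() { super(); }

    @Override
    String name() { return "demo"; }

    // super.name() compiles to invokespecial OpcodeBase.name,
    // which deliberately bypasses the override in this class.
    String parentName() { return super.name(); }

    public static void main(String[] args) {
        OpcodeDemo d = new OpcodeDemo();     // new, dup, invokespecial <init>
        System.out.println(square(7));       // invokestatic, prints 49
        System.out.println(d.name());        // invokevirtual, prints demo
        System.out.println(d.parentName());  // prints base
    }
}
```

In both the `invokespecial` and the `invokestatic` case the target is fixed without consulting the receiver's runtime class, which is the similarity the question is about.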

int3 asked Dec 20 '12 01:12


1 Answer

Disclaimer: It is hard to tell for sure, since I have never read an explicit statement from Oracle about this, but I am fairly confident that this is the reason:

When you look at Java byte code, you could ask the same question about other instructions. Why would the verifier stop you when you push two ints onto the stack and treat them as a single long right after? (Try it, it will stop you.) You could argue that allowing this would let you express the same logic with a smaller instruction set. (To take this argument further: an opcode is a single byte and can therefore encode at most 256 instructions, so the Java byte code set should cut down wherever possible.)

Of course, in theory you would not need separate byte code instructions for pushing ints and longs onto the stack, and you are right that you would not need two instructions, INVOKESPECIAL and INVOKESTATIC, in order to express method invocations. Within a class, a method is uniquely identified by its name and its method descriptor (parameter types and return type), so a class cannot declare both a static and a non-static method with the same name and descriptor. And in order to validate the byte code, the Java runtime must check whether the target method is static anyway.

Remark: This contradicts the answer of v6ak. However, the descriptor of a non-static method is not altered to include a reference to this.getClass(). The Java runtime could therefore always infer the appropriate method binding from the name and descriptor for a hypothetical INVOKESMART instruction. See JVMS §4.3.3.
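The point about descriptors can be checked from plain Java. The following is only a sketch: `DescriptorDemo`, `twiceStatic` and `twiceInstance` are hypothetical names made up for the illustration. It rebuilds the descriptor string of a reflected method with `MethodType.toMethodDescriptorString()` and reads the static flag with `Modifier.isStatic`.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class DescriptorDemo {
    // Hypothetical sample methods with identical parameter and return types.
    static int twiceStatic(int x) { return 2 * x; }
    int twiceInstance(int x) { return 2 * x; }

    private static Method find(String name) {
        try {
            return DescriptorDemo.class.getDeclaredMethod(name, int.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Rebuilds the JVMS 4.3.3 style descriptor from the reflected signature.
    static String descriptorOf(String name) {
        Method m = find(name);
        return MethodType.methodType(m.getReturnType(), m.getParameterTypes())
                         .toMethodDescriptorString();
    }

    // The static flag is stored with the method, not in its descriptor.
    static boolean isStatic(String name) {
        return Modifier.isStatic(find(name).getModifiers());
    }

    public static void main(String[] args) {
        for (String name : new String[] {"twiceStatic", "twiceInstance"}) {
            System.out.println(name + " " + descriptorOf(name)
                    + " static=" + isStatic(name));
        }
        // prints: twiceStatic (I)I static=true
        // prints: twiceInstance (I)I static=false
    }
}
```

Both methods share the descriptor `(I)I`: the receiver does not appear in it, and whether the method is static is recorded separately in its access flags, which is the information a hypothetical INVOKESMART would have to consult.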

So much for the theory. However, the intentions expressed by the two invocation types are quite different. Remember, too, that Java byte code is meant to be produced by tools other than javac as well. With byte code, these tools produce something closer to machine code than to Java source code, but it is still rather high level: byte code is verified, and it is optimized automatically when it is compiled to machine code.

The byte code is an abstraction that intentionally contains some redundancy in order to make its meaning more explicit, just as the Java language uses different names for similar things to make programs more readable. As a further benefit, verification and byte code interpretation/compilation can be faster, because a method's invocation type does not need to be inferred but is stated explicitly in the byte code. This is desirable because verification, interpretation and compilation all happen at run time.
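The same "state the invocation kind explicitly" design shows up in the `java.lang.invoke` API, where the caller picks `findStatic`, `findSpecial` or `findVirtual` rather than letting the runtime guess. The sketch below is an analogy, not a description of how the interpreter handles the opcodes; `LookupKindDemo`, `viaStatic` and `viaSpecial` are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class LookupKindDemo {
    // Hypothetical sample methods, invented for this illustration.
    static String viaStatic() { return "static"; }
    private String viaSpecial() { return "special"; }

    private static final MethodType NO_ARGS_STRING =
            MethodType.methodType(String.class);

    // Lookup that mirrors invokestatic: no receiver argument.
    static String callStatic() {
        try {
            MethodHandle mh = MethodHandles.lookup()
                    .findStatic(LookupKindDemo.class, "viaStatic", NO_ARGS_STRING);
            return (String) mh.invoke();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Lookup that mirrors invokespecial: the receiver is the first argument.
    static String callSpecial() {
        try {
            MethodHandle mh = MethodHandles.lookup()
                    .findSpecial(LookupKindDemo.class, "viaSpecial",
                                 NO_ARGS_STRING, LookupKindDemo.class);
            return (String) mh.invoke(new LookupKindDemo());
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Asking for a static handle to an instance method is rejected; the
    // Javadoc documents IllegalAccessException when the method is not static.
    static boolean wrongKindIsRejected() {
        try {
            MethodHandles.lookup()
                    .findStatic(LookupKindDemo.class, "viaSpecial", NO_ARGS_STRING);
            return false;
        } catch (ReflectiveOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(callStatic());          // prints static
        System.out.println(callSpecial());         // prints special
        System.out.println(wrongKindIsRejected()); // prints true
    }
}
```

Because the requested kind is stated up front, a mismatch is reported when the handle is looked up instead of being silently resolved one way or the other.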

As a final anecdote, I should mention that a class's static initializer <clinit> was not always required to be flagged static: the JVMS demands the ACC_STATIC flag only for class files of version 51.0 (Java SE 7) and above. In this context, the static invocation could also be inferred from the method's name, but this would cause even more run time overhead.

Rafael Winterhalter answered Sep 28 '22 06:09