Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does Java's invokevirtual need to resolve the called method's compile-time class?

Consider this simple Java class:

class MyClass {
  public void bar(MyClass c) {
    c.foo();
  }
}

I want to discuss what happens on the line c.foo().

Original, Misleading Question

Note: Not all of this actually happens with each individual invokevirtual opcode. Hint: If you want to understand Java method invocation, don't read just the documentation for invokevirtual!

At the bytecode level, the meat of c.foo() will be the invokevirtual opcode, and, according to the documentation for invokevirtual, more or less the following will happen:

  1. Look up the foo method defined in compile-time class MyClass. (This involves first resolving MyClass.)
  2. Do some checks, including: Verify that c is not an initialization method, and verify that calling MyClass.foo wouldn't violate any protected modifiers.
  3. Figure out which method to actually call. In particular, look up c's runtime type. If that type has foo(), call that method and return. If not, look up c's runtime type's superclass; if that type has foo, call that method and return. If not, look up c's runtime type's superclass's superclass; if that type has foo, call that method and return. Etc.. If no suitable method can be found, then error.

Step #3 alone seems adequate for figuring out which method to call and verifying that said method has the correct argument/return types. So my question is why step #1 gets performed in the first place. Possible answers seem to be:

  • You don't have enough information to perform step #3 until step #1 is complete. (This seems implausible at first glance, so please explain.)
  • The linking or access modifier checks done in #1 and #2 are essential to prevent certain bad things from happening, and those checks must be performed based on the compile-time type, rather than the run-time type hierarchy. (Please explain.)

Revised Question

The core of the javac compiler output for the line c.foo() will be an instruction like this:

invokevirtual i

where i is an index to MyClass' runtime constant pool. That constant pool entry will be of type CONSTANT_Methodref_info, and will indicate (maybe indirectly) A) the name of the method called (i.e. foo), B) the method signature, and C) the name of compile time class that the method is called on (i.e. MyClass).

The question is, why is the reference to the compile-time type (MyClass) needed? Since invokevirtual is going to do dynamic dispatch on the runtime type of c, isn't it redundant to store the reference to the compile-time class?

like image 971
Chris Avatar asked Apr 01 '10 21:04

Chris


People also ask

What is Invokevirtual in Java?

invokevirtual dispatches a Java method. It is used in Java to invoke all methods except interface methods (which use invokeinterface), static methods (which use invokestatic), and the few special cases handled by invokespecial.

How does Invokedynamic work in Java?

The call site of the invokedynamic instruction is said to be unlaced at class loading time. The BSM is called to determine what method should actually be called, and the resulting CallSite object will then be laced into the call site.

How the Java virtual machine handles method invocation and return?

When the JVM invokes a method, it creates a stack frame of the proper size for that method. Adding a new frame onto the Java stack when a method is invoked is called "pushing" a stack frame; removing a frame when a method returns is called "popping" a stack frame. The Java stack is made up solely of these frames.

What is Invokedynamic?

A: invokedynamic is a bytecode instruction that facilitates the implementation of dynamic languages (for the JVM) through dynamic method invocation. This instruction is described in the Java SE 7 Edition of the JVM Specification.


2 Answers

It is all about performance. When by figuring out the compile-time type (aka: static type) the JVM can compute the index of the invoked method in the virtual function table of the runtime type (aka: dynamic type). Using this index step 3 simply becomes an access into an array which can be accomplished in constant time. No looping is needed.

Example:

class A {
   void foo() { }
   void bar() { }
}

class B extends A {
  void foo() { } // Overrides A.foo()
}

By default, A extends Object which defines these methods (final methods omitted as they are invoked via invokespecial):

class Object {
  public int hashCode() { ... }
  public boolean equals(Object o) { ... }
  public String toString() { ... }
  protected void finalize() { ... }
  protected Object clone() { ... }
}

Now, consider this invocation:

A x = ...;
x.foo();

By figuring out that x's static type is A the JVM can also figure out the list of methods that are available at this call site: hashCode, equals, toString, finalize, clone, foo, bar. In this list, foo is the 6th entry (hashCode is 1st, equals is 2nd, etc.). This calculation of the index is performed once - when the JVM loads the classfile.

After that, whenever the JVM processes x.foo() is just needs to access the 6th entry in the list of methods that x offers, equivalent to x.getClass().getMethods[5], (which points at A.foo() if x's dynamic type is A) and invoke that method. No need to exhaustively search this array of methods.

Note that the method's index, remains the same regardless of the dynamic type of x. That is: even if x points to an instance of B, the 6th methods is still foo (although this time it will point at B.foo()).

Update

[In light of your update]: You're right. In order to perform a virtual method dispatch all the JVM needs is the name+signature of the method (or the offset within the vtable). However, the JVM does not execute things blindly. It first checks that the cassfiles loaded into it are correct in a process called verification (see also here).

Verification expresses one of the design principles of the JVM: It does not rely on the compiler to produce correct code. It checks the code itself before it allows it to be executed. In particular, the verifier checks that every invoked virtual method is actually defined by the static type of the receiver object. Obviously, the static type of the receiver is needed to perform such a check.

like image 127
Itay Maman Avatar answered Sep 19 '22 11:09

Itay Maman


That's not the way I understand it after reading the documentation. I think you have steps 2 and 3 transposed, which would make the whole series of events more logical.

like image 30
Rob Heiser Avatar answered Sep 20 '22 11:09

Rob Heiser