
Where does the resolution stage of Java class loading actually start?

I just finished reading through the Java Virtual Machine Specification and the section on class loading left me puzzled. As far as I understood it in general and after reading the specification, I thought that the overall instantiation of a class consisted of the following steps in the following order:

  • Creation / Loading: The class loader locates a stream of bytes representing the class. The source can be a file, a network stream, or anything else a class loader is implemented to fetch. If no class can be found, a ClassNotFoundException is thrown. Some basic validation already happens at this point: a ClassFormatError is thrown if the byte array does not represent a Java class (for example, the magic number is missing), and an UnsupportedClassVersionError is thrown if the class version is not supported by the running JVM instance.

  • Linking: The class is hooked into the JVM. If something goes wrong, a subclass of LinkageError is thrown. Linking consists of the three substeps:

    • Verification: The JVM checks that the byte stream is a structurally valid Java class, for example that the byte code of each method contains no formal errors such as an overflowing operand stack. If a class fails verification, a VerifyError is thrown.

    • Preparation: The JVM allocates memory for all static fields and might create an instance template for speeding up instance creation. Virtual method tables are created. No class-loading-specific errors are thrown at this stage. (An OutOfMemoryError might be thrown, though.)

    • Resolution: All symbolic references that were loaded into the method area in the form of the run-time constant pool are resolved to actual types loaded by this JVM. If a symbolic reference can be resolved but results in a conflict of definitions, an IncompatibleClassChangeError is thrown. If a referenced class cannot be found, a NoClassDefFoundError is thrown, which basically wraps the ClassNotFoundException that was thrown by the class loader attempting to load the referenced class. If a class turns out to be its own superclass or superinterface, a ClassCircularityError is thrown. Resolution can happen in one of two flavors, and the choice is up to the implementors of the JVM:

      1. Eager: All symbolic references to other fields, methods or classes are resolved right now.

      2. Lazy: Resolution of symbolic references is postponed until the first use of a method. As a consequence, a class referring to a non-existent class might never throw an error if this reference never needs to be resolved.

  • Initialization: The class's static initializers that are defined in the class as Java code are run. If an exception is caused by such an initializer, this exception is rethrown wrapped in an ExceptionInInitializerError.
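The separation between loading and initialization in the list above can be observed from plain Java. The following is a minimal sketch, not a definitive test of any particular JVM: the class names LoadVsInit and Target are hypothetical and exist only for illustration. It relies on the three-argument Class.forName, whose second parameter controls whether the class is initialized.

```java
import java.util.ArrayList;
import java.util.List;

public class LoadVsInit {
    // Records observable events in the order they happen.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical class whose static initializer leaves a trace.
    static class Target {
        static { EVENTS.add("Target initialized"); }
    }

    static List<String> run() {
        ClassLoader loader = LoadVsInit.class.getClassLoader();
        // Binary name of the nested class, built without touching Target itself.
        String name = LoadVsInit.class.getName() + "$Target";
        try {
            // initialize = false: the class is loaded but its static initializer does not run.
            Class.forName(name, false, loader);
            EVENTS.add("after load");

            // initialize = true: triggers initialization if it has not happened yet.
            Class.forName(name, true, loader);
            EVENTS.add("after init");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return EVENTS;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On a first run in a fresh JVM, "Target initialized" is recorded between "after load" and "after init", which shows that a class can exist in the JVM for some time before its static initializer runs.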

What puzzles me is the resolution phase of the above class loading mechanism. Why is resolution defined as an explicit step within linking which occurs specifically after preparation? Already within the description of the class loading phase, it is mentioned that

If C has any direct superinterfaces, the symbolic references from C to its direct superinterfaces are resolved using the algorithm of §5.4.3.1.

Are symbolic references not also resolved during verification, given that verification is described as follows:

Verification (§4.10) ensures that the binary representation of a class or interface is structurally correct (§4.9). Verification may cause additional classes and interfaces to be loaded (§5.3) but need not cause them to be verified or prepared.

I always have this picture in mind

[Image: Java class loading overview]

Source: http://www.programcreek.com

which I have seen in almost every explanation of class loading. Should resolution not rather be seen as an overall responsibility that is part of all phases (creation/loading, verification, linking and initialization), since resolution can be done lazily?

Currently, I would argue that it would make sense to take the resolution stage out of this image and declare it a general procedure that can be used at any time. Information about other classes might be required at any stage, so such a class has to be loaded, which in turn requires resolving the symbolic reference to it. In the picture shown, it looks as if resolution only happens at one specific point in a chain of separate events.

I suspect that this depiction of resolution as a dedicated step may just be a legacy from a time when resolution was never performed lazily but had a fixed place where all remaining symbolic references were resolved.

What I want to know: Should resolution in today's JVMs be understood the way I described it? Or am I wrong about this and resolution can still be understood as a dedicated step in a fixed time line just as the image shows it?

Rafael Winterhalter asked Dec 03 '13 09:12



1 Answer

Your picture shows resolution as always happening after preparation, but that cannot work. The direct superclasses are needed for preparation, because you need to know the instance fields of the superclasses to determine the object instance memory layout for a particular class. Further, the static initializers of a class and its superclasses must have been executed before the class can be used, i.e. before creating instances or invoking static methods.
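The ordering constraint on static initializers can be made visible with a small experiment. This is only an illustrative sketch: SuperFirst, Base and Derived are hypothetical names, and the example shows initialization order rather than resolution itself.

```java
import java.util.ArrayList;
import java.util.List;

public class SuperFirst {
    // Records the order in which static initializers run.
    static final List<String> ORDER = new ArrayList<>();

    static class Base {
        static { ORDER.add("Base"); }
    }

    static class Derived extends Base {
        static { ORDER.add("Derived"); }
    }

    static List<String> run() {
        // First active use of Derived: its superclass must be initialized first.
        new Derived();
        return ORDER;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [Base, Derived]
    }
}
```

Because Derived cannot be initialized before Base, the reference from Derived to its direct superclass has to be resolved early; it cannot be deferred the way other references can.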

This differs from the resolution of all other referenced types which can be deferred far longer. It is permissible to resolve a type used in a method just before the method is invoked the first time.

The beginning of §5.4.3 (Resolution) states this explicitly:

The Java Virtual Machine instructions anewarray, checkcast, getfield, getstatic, instanceof, invokedynamic, invokeinterface, invokespecial, invokestatic, invokevirtual, ldc, ldc_w, multianewarray, new, putfield, and putstatic make symbolic references to the run-time constant pool. Execution of any of these instructions requires resolution of its symbolic reference.

So the distinction is quite clear. There is the resolution of the direct superclass and the directly implemented interfaces (or superinterfaces in the case of an interface), which happens early, and there is the resolution of symbolic references for the purpose of the byte code instructions listed above, which can be postponed.
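The deferred case can be approximated with the following sketch. The names LazyUse, Helper and useHelper are hypothetical. Note the assumption: the code observes initialization, which is triggered when the invokestatic instruction is first executed. Resolution of the symbolic reference must have happened by then, but whether a given JVM resolved it earlier is an implementation choice that this example cannot reveal.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyUse {
    // Records observable events in the order they happen.
    static final List<String> EVENTS = new ArrayList<>();

    static class Helper {
        static { EVENTS.add("Helper initialized"); }
        static int answer() { return 42; }
    }

    // The body compiles to an invokestatic that refers to Helper symbolically.
    static int useHelper() {
        return Helper.answer();
    }

    static List<String> run() {
        EVENTS.add("before call");
        useHelper(); // first execution of the invokestatic on Helper
        EVENTS.add("after call");
        return EVENTS;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On a first run, "Helper initialized" appears between "before call" and "after call": although LazyUse refers to Helper in its constant pool, nothing about Helper has to be ready until the instruction that uses it actually executes.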

Holger answered Oct 07 '22 18:10