Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the size cost of Java inheritance?

There are various articles out there on the interwebs that try to empirically estimate the overhead of a java.lang.Object in particular JVM implementations. For example, I've seen the size overhead of a bare Object estimated at 8 bytes in some JVMs.

What I would like to know is whether a typical JVM implementation of the extends relationship introduces incremental size overhead at every level of the class hierarchy. In other words, suppose you have a class hierarchy with N levels of subclasses. Is the overhead of the in-memory representation of a class instance O(1) or O(N)?

I imagine it is O(1) because although the size of some of the hidden fluffy stuff you need to be a Java Object (vtable, chain of classes) will grow as the inheritance hierarchy grows, they grow per-class, not per-instance, and the JVM implementation can store constant-size pointers to these entities in a constant-size header attached to every Object.

So in theory, the overhead directly attached to the in-memory representation of any Java object should be O(1) for inheritance depth N. Does anyone know if it's true in practice?

like image 289
0xbe5077ed Avatar asked May 30 '14 05:05

0xbe5077ed


People also ask

What is size of inherited class in Java?

The reason for sizeof(DerivedClass) being 16 bytes in MC++ is that it starts each class with a 4 byte aligned address so that accessing the member of that class will be easy (again, the memory read/write cycle).

What cost are associated with inheritance in Java?

Inheritance decreases the execution speed due to the increased time and effort it takes, the program to jump through all the levels of overloaded classes. Inheritance makes the two classes (base and inherited class) get tightly coupled. This means one cannot be used independently of each other.

What is the inheritance of Java?

Inheritance is one of the key features of OOP that allows us to create a new class from an existing class. The new class that is created is known as subclass (child or derived class) and the existing class from where the child class is derived is known as superclass (parent or base class).

How many levels of inheritance are there?

Single level inheritance: It will signify only two levels of classes. Base class and derived class. Multilevel inheritance: It will signify three or more levels of classes. Derived class, base class and one or more intermediate classes.


4 Answers

When in doubt, look at the source (well, a source; each JVM is free to choose how to do it, as the standard does not mandate any internal representation). So I had a look, and found the following comment within the implementation of JDK 7-u60's hotspot JVM:

// A Klass is the part of the klassOop that provides:
//  1: language level class object (method dictionary etc.)
//  2: provide vm dispatch behavior for the object
// Both functions are combined into one C++ class. The toplevel class "Klass"
// implements purpose 1 whereas all subclasses provide extra virtual functions
// for purpose 2.

// One reason for the oop/klass dichotomy in the implementation is
// that we don't want a C++ vtbl pointer in every object.  Thus,
// normal oops don't have any virtual functions.  Instead, they
// forward all "virtual" functions to their klass, which does have
// a vtbl and does the C++ dispatch depending on the object's

The way I read it, this means that, for this (very popular) implementation, object instances only store a pointer to their class. The cost, per-instance of a class, of that class having a longer or shorter inheritance chain is effectively 0. The classes themselves do take up space in memory though (but only once per class). Run-time efficiency of deep inheritance chains is another matter.

like image 130
tucuxi Avatar answered Oct 22 '22 14:10

tucuxi


The JVM specification states

The Java Virtual Machine does not mandate any particular internal structure for objects.

So the specification does not care how you do it. But...

In some of Oracle’s implementations of the Java Virtual Machine, a reference to a class instance is a pointer to a handle that is itself a pair of pointers: one to a table containing the methods of the object and a pointer to the Class object that represents the type of the object, and the other to the memory allocated from the heap for the object data.

So in typical Oracle implementations it is O(1) for methods. This method table is the Method Area which is per class.

The Java Virtual Machine has a method area that is shared among all Java Virtual Machine threads. The method area is analogous to the storage area for compiled code of a conventional language or analogous to the "text" segment in an operating system process. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors, including the special methods (§2.9) used in class and instance initialization and interface initialization.

Also, about method entries

The method_info structures represent all methods declared by this class or interface type, including instance methods, class methods, instance initialization methods (§2.9), and any class or interface initialization method (§2.9). The methods table does not include items representing methods that are inherited from superclasses or superinterfaces.

like image 28
Sotirios Delimanolis Avatar answered Oct 22 '22 14:10

Sotirios Delimanolis


An instance generally requires the following data, although it's up to the implementation exactly what to do:

  • the instance fields of the class and its parent classes, which I assume you don't mean to include in the term "overhead"
  • some means to lock the object
  • if the garbage collector relocates objects, then some means to record the original hash of the object (for Object.hashCode)
  • some means to access type information

As you guess in your question, in a "normal" Java implementation the type information is stored per-class, not per instance. Part of the definition of "type" is that two instances of the same class necessarily have the same type information, there's no obvious reason not to share it. So you would expect that per-instance overhead to be constant, not to depend on the class hierarchy.

That is to say, adding extra empty classes or interfaces to a class should not increase the size of its instances. I don't think either the language or the JVM spec actually guarantees this, though, so don't make too many assumptions about what a "non-normal" Java implementation is allowed to do.

As an aside, the second and third things on my list can be combined via cunning trickery, so that both of them together are a single pointer. The article you link to refers to references taking 4 bytes, so the 8 bytes it comes up with for an object is one pointer to type information, one field containing either a hashcode or a pointer to a monitor, and probably some flags in the lowest 2 bits of one or both of those pointer fields. Object will (you'd expect) be larger on a 64-bit Java.

like image 4
Steve Jessop Avatar answered Oct 22 '22 12:10

Steve Jessop


Double and Integer, which extend Number, which extends Object, do not have an O(n) behavior, that is, an Integer is not 3X the size of an Object, so I think the answer is O(1). e.g. see this old SO question

like image 2
user949300 Avatar answered Oct 22 '22 13:10

user949300