Object representation in JVM

The Java® Virtual Machine Specification, section 2.7 "Representation of Objects", says:

In some of Oracle’s implementations of the Java Virtual Machine, a reference to a class instance is a pointer to a handle that is itself a pair of pointers: one to a table containing the methods of the object and a pointer to the Class object that represents the type of the object, and the other to the memory allocated from the heap for the object data.

My impression is that the handle has three pointers instead of two:

  • One pointer to a table containing the methods of the object (in the method area).
  • A pointer to the Class object that represents the type of the object (in the method area).
  • A pointer to the memory allocated from the heap for the object data.

Could anyone please clear up this confusion for me?

asked Dec 24 '22 12:12 by ecdhe


2 Answers

First of all, the quoted text says "In some of Oracle’s implementations of the Java Virtual Machine ...". This description is not correct for all implementations.

I think your misunderstanding comes from misparsing the text. What I think it actually says is this:

  1. A reference is a pointer to a handle, and the handle is a pair of pointers.
  2. The first pointer of the pair points to a table.
  3. The table contains the methods of the object.
  4. The table also contains a pointer to the Class object.
  5. The second pointer in the pair points to the memory area that holds the object data.

The handle has two pointers, not three.
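
To make that reading concrete, here is a toy model in Java. It is only a sketch of the structure the quoted sentence describes: the class names `HandleModel`, `MethodTable` and `Handle` are made up for illustration, and a real JVM lays this out in native memory rather than with Java objects.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical names, not how any JVM is actually written.
final class HandleModel {

    /** Per-class table: the methods of the object plus a pointer to its Class object. */
    static final class MethodTable {
        final List<String> methods; // stand-in for the method entry points
        final Class<?> type;        // pointer to the Class object

        MethodTable(List<String> methods, Class<?> type) {
            this.methods = methods;
            this.type = type;
        }
    }

    /** The handle itself: exactly two pointers. */
    static final class Handle {
        final MethodTable table; // pointer 1: to the table
        final int[] data;        // pointer 2: to the heap memory holding the object data

        Handle(MethodTable table, int[] data) {
            this.table = table;
            this.data = data;
        }
    }

    /** Counts the pointers stored directly in a handle. */
    static int pointersInHandle() {
        return Handle.class.getDeclaredFields().length;
    }

    public static void main(String[] args) {
        MethodTable table = new MethodTable(Arrays.asList("intValue", "toString"), Integer.class);
        // A reference is then just a pointer to the handle.
        Handle reference = new Handle(table, new int[] {0x12345678});

        System.out.println("pointers in handle: " + pointersInHandle());
        System.out.println("type, reached via the table: " + reference.table.type.getName());
        System.out.printf("data, reached via the handle: %x%n", reference.data[0]);
    }
}
```

The pointer to the Class object lives inside the table, so it does not count as a third pointer in the handle.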


My understanding is that in modern Oracle JVMs, the representation is as follows:

  1. A reference is a pointer directly to a node in the heap; there is no intermediate handle.
  2. The node (for an ordinary object) consists of two header words followed by zero or more words that hold the object data.
  3. The two header words consist of the following:
    • a "klass" word that identifies the object's class descriptor (in HotSpot, a possibly compressed pointer to it)
    • a flag word (the "mark word") containing bits used for various purposes: e.g. GC "mark" bits, bits to represent the object's primitive-lock state, bits related to identity hashcodes, the object's GC age and so on.
  4. The class descriptor includes the method table and the reference to the Class object. It also gives the size (in words) of an instance's data area, which the GC needs.

The representation for arrays is a bit different: for a start, there is a third word containing the array length.
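
As a rough sketch of what this layout implies for object sizes, the arithmetic can be written out as below. The class name `ObjectLayoutSketch` and its methods are hypothetical, and the constants are assumptions: 8-byte words (a 64-bit JVM with compressed oops turned off) and sizes rounded up to 8 bytes, which is HotSpot's default object alignment. Other JVMs and settings will give different numbers.

```java
// Back-of-the-envelope model of the header layout described above.
final class ObjectLayoutSketch {
    static final int WORD = 8;      // assumed: 64-bit JVM, compressed oops off
    static final int ALIGNMENT = 8; // assumed: HotSpot's default object alignment

    /** Rounds a size up to the next multiple of ALIGNMENT. */
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Ordinary object: two header words followed by the field data. */
    static long instanceSize(long fieldBytes) {
        return align(2L * WORD + fieldBytes);
    }

    /** Array: two header words, a third word for the length, then the elements. */
    static long arraySize(long length, long elementBytes) {
        return align(3L * WORD + length * elementBytes);
    }

    public static void main(String[] args) {
        // An Integer holds a single 4-byte int field.
        System.out.println("Integer:   " + instanceSize(4) + " bytes");
        // An Object[1] holds one 8-byte reference.
        System.out.println("Object[1]: " + arraySize(1, 8) + " bytes");
    }
}
```

Under these assumptions an Integer comes out at 24 bytes: 16 bytes of header, 4 bytes of data and 4 bytes of padding.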

(See also What is in java object header)


But note that these details are implementation-specific, particularly the flag word.

answered Jan 02 '23 08:01 by Stephen C


How the Oracle JVM actually implements this is a detail which isn't specified. A reference is a pointer (possibly compressed, see Compressed Oops) to the object's header and data.

In the header is a pointer to the class definition. You can see this using Unsafe. To keep things simple, I have assumed compressed oops are turned off.

WARNING: This is not suitable for work; only try this at home. ;)

This was run on Oracle Java 8. Another JVM version might behave differently.

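// UNSAFE is a sun.misc.Unsafe instance, obtained reflectively from the
// private static field Unsafe.theUnsafe (see the complete code linked below).
// Run with -XX:-UseCompressedOops so that references are 8 bytes.
Object i = 0x12345678; // autoboxed to an Integer

// set the System.identityHashCode by overwriting bytes 1..4 of the header
UNSAFE.putInt(i, 1L, 0x23456789);
System.out.printf("identityHashCode: %x%n", System.identityHashCode(i));

// read the raw 64-bit address of `i` out of the first slot of an Object[]
Object[] obj = {i};
assert Unsafe.ARRAY_OBJECT_INDEX_SCALE == 8; // 8 bytes per reference.
long address = UNSAFE.getLong(obj, (long) Unsafe.ARRAY_OBJECT_BASE_OFFSET);
System.out.printf("address: %x%n", address);
// dump 24 bytes: 16 bytes of header, then the int value and padding
for (int j = 0; j < 24; j++)
    System.out.printf("%02x ", UNSAFE.getByte(address + j) & 0xFF);
System.out.println();

System.out.printf("`i` is a %s and is %x%n", i.getClass(), i);

// now some really scary sh!t: copy the klass pointer of a Long (the boxed 0L)
// over the klass pointer of `i`, which sits at offset 8 in the header
long longClassPointer = UNSAFE.getLong(0L, 8L);
UNSAFE.putLong(i, 8L, longClassPointer);
System.out.printf("`i` is now a %s and is %x%n", i.getClass(), i);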

This prints the identity hash code that was set, the 64-bit address, the contents of the object's header and data, and finally what happens if you change the klass pointer to that of the Long class instead.

identityHashCode: 23456789
address: 1d3f9ca00
01 89 67 45 23 00 00 00 a0 74 f8 26 00 00 00 00 78 56 34 12 00 00 00 00 
`i` is a class java.lang.Integer and is 12345678
`i` is now a class java.lang.Long and is 12345678
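
If it helps to read the 24 dumped bytes, they can be decoded with ordinary Java, no Unsafe required. This is only a sketch: the class name `HeaderDumpDecoder` is made up, and it assumes a little-endian machine and the 64-bit HotSpot layout used by Java 8 (mark word at offset 0 with the identity hash in bits 8 to 38, klass pointer at offset 8, the int value at offset 16).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Decodes the hex dump printed above; the layout is an assumption, see the lead-in.
final class HeaderDumpDecoder {

    /** Turns "01 89 67 ..." into the bytes it represents. */
    static byte[] parseHex(String dump) {
        String[] parts = dump.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int k = 0; k < parts.length; k++) {
            out[k] = (byte) Integer.parseInt(parts[k], 16);
        }
        return out;
    }

    private static ByteBuffer littleEndian(byte[] obj) {
        return ByteBuffer.wrap(obj).order(ByteOrder.LITTLE_ENDIAN);
    }

    /** First header word: lock bits, GC age, identity hash. */
    static long markWord(byte[] obj) { return littleEndian(obj).getLong(0); }

    /** Second header word: pointer to the class descriptor. */
    static long klassPointer(byte[] obj) { return littleEndian(obj).getLong(8); }

    /** The Integer's single field, stored right after the header. */
    static int intValue(byte[] obj) { return littleEndian(obj).getInt(16); }

    /** Assumed position of the identity hash: bits 8..38 of the mark word. */
    static int identityHash(long mark) { return (int) ((mark >>> 8) & 0x7FFFFFFFL); }

    public static void main(String[] args) {
        byte[] obj = parseHex(
            "01 89 67 45 23 00 00 00 a0 74 f8 26 00 00 00 00 78 56 34 12 00 00 00 00");
        System.out.printf("mark word:     %016x%n", markWord(obj));
        System.out.printf("identity hash: %x%n", identityHash(markWord(obj)));
        System.out.printf("klass pointer: %x%n", klassPointer(obj));
        System.out.printf("int value:     %x%n", intValue(obj));
    }
}
```

The identity hash decodes to 23456789 and the value to 12345678, matching what the program printed; the klass pointer value a0 74 f8 26 is specific to that one run.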

The complete code is at https://github.com/peter-lawrey/Performance-Examples/blob/master/src/main/java/vanilla/java/unsafe/AccessRawMemoryMain.java

answered Jan 02 '23 08:01 by Peter Lawrey