Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hash Map Memory Overhead

I was studying about internal structure of Hash Map where i came across following details:

  • Implementation is an array of HashMap$Entry objects:

  • Each HashMap$Entry contains: – int KeyHash – Object next – Object key – Object Value

  • Default capacity is 16 entries

  • Empty size is 128 bytes

  • Overhead is 48 bytes for HashMap, plus (16 + (entries * 4bytes)) for array – Plus overhead of HashMap$Entry objects

  • Additional 32bytes per key ↔ value entry Overhead of HashMap is
    therefore: – 48 bytes, plus 36 bytes per entry

Can anyone please explain me "Overhead is 48 bytes for HashMap, plus (16 + (entries * 4bytes)) for array" and "Additional 32bytes per key ↔ value entry Overhead of HashMap is
therefore: – 48 bytes, plus 36 bytes per entry"
???

I can't understand how these conclusions came i.e. how we came across this final memory details about hash map.

like image 993
Encyclopedia Avatar asked Nov 24 '13 13:11

Encyclopedia


1 Answers

You might want to read this SO question. The object overhead will vary between different CPU architectures, VM implementations and between different Java versions (e.g. OOPS, there are also suggestions of fixnums etc.).

But let's see for ourselves for OpenJDK 7, the HashMap class:


Overhead is 48 bytes for HashMap

Housekeeping info. Presumably 8 bytes.

There are three fields in the implementation that hold the views made by keySet(), values() and entrySet() methods. Three pointers, that's 12 bytes on a 32-bit machine.

// from AbstractMap
transient volatile Set<K>        keySet = null;
transient volatile Collection<V> values = null;

// from HashMap
private transient Set<Map.Entry<K,V>> entrySet = null

There are three int fields: size, threshold and modCount. 12 bytes.

There is one float field: loadFactor. 4 bytes.

The table pointer itself (Entry[] table). 4 bytes.

(the static final fields don't count, they are compile-time constants)

All that gives us 40 bytes flat cost of a HashMap instance. It's not 48 bytes and I don't know why. Maybe I missed something, maybe your text referred to other version. These things sometimes change a little. There may have been an extra field in the past. Maybe it's padded from 40 to 48 bytes (but it shouldn't).


16 + (entries * 4 bytes) for array

The Entry[] table table is instantiated when the HashMap is constructed, we need to count it, too.

An instantiated array takes 8 bytes of housekeeping data, 4 bytes for the length property. That's 12 bytes, not 16. Again, no idea where that comes from.

Every entry is a pointer to an Entry object, so that's 4 bytes per entry. That was easy.


Additional 32 bytes per key ↔ value entry

Again, 8 bytes of housekeeping.

Now, the fields:

final K key;    // a reference, 4 bytes
V value;        // a reference, 4 bytes
Entry<K,V> next;  // a reference, 4 bytes
final int hash; // an integer primitive, 4 bytes

16 bytes.

That's a final count of 24 bytes per entry. Again, that's not 36.


I have no idea where the numbers you got there come from. Could that be an IBM vm? Could that be a 64-bit OS? Maybe the the housekeeping info is 16 bytes nowadays (will Java 8 change anything?).

Anyway, I tried to compute the memory consumption to my best knowledge.

like image 57
Petr Janeček Avatar answered Oct 05 '22 08:10

Petr Janeček