What is the memory overhead of a Java HashMap compared to an ArrayList? Update: I would like to speed up searching for specific values in a big pack (6 million+) of identical objects. I am therefore thinking about using one or several HashMaps instead of an ArrayList, but I am wondering what the overhead of a HashMap is. As far as I understand, the key is not stored, only the hash of the key, so it should be something like the size of the hash of the object plus one pointer. But what hash function is used? Is it the one offered by Object or another one?


Memory overhead of Java HashMap compared to ArrayList

2 Answers

If you're comparing HashMap with ArrayList, I presume you're doing some sort of searching or indexing of the ArrayList, such as binary search or a custom hash table, because a .get(key) over 6 million entries would be infeasible with a linear search.

Using that assumption, I've done some empirical tests and come to the conclusion that "you can store 2.5 times as many small objects in the same amount of RAM if you use an ArrayList with binary search or a custom hash map implementation, versus HashMap". My test was based on small objects containing only 3 fields, one of which is the key, and the key is an integer. I used a 32-bit JDK 1.6. See below for caveats on this figure of "2.5".
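The binary-search option mentioned above could look roughly like this. It is only a sketch: the `Record` type, the `BY_KEY` comparator and the `find` helper are made-up names for illustration, and it assumes the list is bulk-loaded and sorted once before lookups start.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortedListLookup {
    // Hypothetical record type: the key lives inside the value.
    static final class Record {
        final int key;
        final String name;
        Record(int key, String name) { this.key = key; this.name = name; }
    }

    // Orders records by key only.
    static final Comparator<Record> BY_KEY = new Comparator<Record>() {
        public int compare(Record a, Record b) {
            return a.key < b.key ? -1 : (a.key == b.key ? 0 : 1);
        }
    };

    // Returns the record with the given key, or null if absent.
    // The list must already be sorted with BY_KEY.
    static Record find(List<Record> sorted, int key) {
        int idx = Collections.binarySearch(sorted, new Record(key, null), BY_KEY);
        return idx >= 0 ? sorted.get(idx) : null;
    }

    public static void main(String[] args) {
        List<Record> records = new ArrayList<Record>();
        records.add(new Record(42, "forty-two"));
        records.add(new Record(7, "seven"));
        records.add(new Record(19, "nineteen"));
        Collections.sort(records, BY_KEY);   // sort once after bulk loading

        System.out.println(find(records, 19).name);  // prints "nineteen"
        System.out.println(find(records, 8));        // prints "null"
    }
}
```

Each lookup allocates one short-lived probe object, but nothing extra is retained per stored element beyond the list's own reference array.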

The key things to note are:

(a) It's not the space required for references or the "load factor" that kills you, but rather the overhead of object creation. If the key is a primitive type, or a combination of 2 or more primitive or reference values, then each key requires its own object (for an int key, a boxed Integer), which carries an overhead of 8 bytes.

(b) In my experience you usually need the key as part of the value (e.g. to store customer records indexed by customer id, you still want the customer id as part of the Customer object). This means it is IMO somewhat wasteful that a HashMap stores separate references to keys and values.
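To put rough numbers on point (a), here is a back-of-envelope estimate. Every constant in it is an assumption about a 32-bit HotSpot-style JVM (8-byte object header, 4-byte references, 8-byte allocation alignment), and the entry layout is assumed to be key reference, value reference, next reference and a cached int hash. It is not a measurement, and the class and method names are invented for this sketch.

```java
public class OverheadEstimate {
    // All constants are assumptions for a 32-bit HotSpot-style JVM.
    static final int HEADER = 8;   // per-object header, the "8 bytes" from the answer
    static final int REF = 4;      // size of one reference
    static final int INT = 4;      // size of one int field
    static final int ALIGN = 8;    // assumed allocation granularity

    // Rounds an object size up to the next multiple of ALIGN.
    static int align(int bytes) {
        return (bytes + ALIGN - 1) / ALIGN * ALIGN;
    }

    // Payload: header plus three int fields.
    static int payloadBytes() {
        return align(HEADER + 3 * INT);
    }

    // HashMap<Integer, Payload>: one entry object (three references and a cached hash),
    // one boxed Integer key, a share of the bucket array, and the payload itself.
    static double hashMapBytesPerEntry(double loadFactor) {
        int entry = align(HEADER + 3 * REF + INT);
        int boxedKey = align(HEADER + INT);
        return entry + boxedKey + REF / loadFactor + payloadBytes();
    }

    // List-backed table: a share of the reference array plus the payload itself.
    static double listBytesPerEntry(double loadFactor) {
        return REF / loadFactor + payloadBytes();
    }

    public static void main(String[] args) {
        double map = hashMapBytesPerEntry(0.75);
        double list = listBytesPerEntry(0.80);
        System.out.printf("HashMap: %.1f bytes/entry%n", map);
        System.out.printf("List:    %.1f bytes/entry%n", list);
        System.out.printf("Ratio:   %.2f%n", map / list);
    }
}
```

Under these assumptions the ratio lands somewhere around 2.4, in the same ballpark as the figures quoted in this answer. A real run also pays for table resizing and garbage-collector headroom, so treat this as an illustration of where the bytes go rather than a prediction.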

Caveats:

1. The most common type used for HashMap keys is String. The object creation overhead doesn't apply here, so the difference would be smaller.
2. I got a figure of 2.8 (8880502 entries inserted into the ArrayList compared with 3148004 into the HashMap on a -Xmx256M JVM), but my ArrayList load factor was 80% and my objects were quite small: 12 bytes plus 8 bytes of object overhead.
3. My figure, and my implementation, require that the key is contained within the value. Otherwise I'd have the same problem with object creation overhead and it would be just another implementation of HashMap.

My code:

```java
// Payload.java
public class Payload {
    int key, b, c;

    Payload(int _key) {
        key = _key;
    }
}

// Overhead.java
import org.junit.Test;

import java.util.HashMap;
import java.util.Map;

public class Overhead {
    // Inserts until the heap is exhausted, then reports how many entries fit.
    @Test
    public void useHashMap() {
        int i = 0;
        try {
            Map<Integer, Payload> map = new HashMap<Integer, Payload>();
            for (i = 0; i < 4000000; i++) {
                int key = (int) (Math.random() * Integer.MAX_VALUE);
                map.put(key, new Payload(key));
            }
        } catch (OutOfMemoryError e) {
            System.out.println("Got up to: " + i);
        }
    }

    // Same experiment against the list-backed table below.
    @Test
    public void useArrayList() {
        int i = 0;
        try {
            ArrayListMap map = new ArrayListMap();
            for (i = 0; i < 9000000; i++) {
                int key = (int) (Math.random() * Integer.MAX_VALUE);
                map.put(key, new Payload(key));
            }
        } catch (OutOfMemoryError e) {
            System.out.println("Got up to: " + i);
        }
    }
}

// ArrayListMap.java
import java.util.ArrayList;

public class ArrayListMap {
    private ArrayList<Payload> map = new ArrayList<Payload>();
    private int[] primes = new int[128];   // candidate probe step sizes

    static boolean isPrime(int n) {
        for (int i = (int) Math.sqrt(n); i >= 2; i--) {
            if (n % i == 0)
                return false;
        }
        return true;
    }

    ArrayListMap() {
        // Pre-fill the table with empty slots.
        for (int i = 0; i < 11000000; i++)    // this is clumsy, I admit
            map.add(null);
        // Collect 128 consecutive primes, starting at 31.
        int n = 31;
        for (int i = 0; i < 128; i++) {
            while (!isPrime(n))
                n += 2;
            primes[i] = n;
            n += 2;
        }
        System.out.println("Capacity = " + map.size());
    }

    // Open addressing with double hashing. Assumes the table never fills up;
    // a repeated key occupies a second slot instead of replacing the first.
    public void put(int key, Payload value) {
        int hash = key % map.size();
        // Mask off the sign bit so a negative key cannot produce a negative index.
        int hash2 = primes[(key & 0x7fffffff) % primes.length];
        if (hash < 0)
            hash += map.size();
        do {
            if (map.get(hash) == null) {
                map.set(hash, value);
                return;
            }
            hash += hash2;
            if (hash >= map.size())
                hash -= map.size();
        } while (true);
    }

    // Follows the same probe sequence as put() until the key or an empty slot is found.
    public Payload get(int key) {
        int hash = key % map.size();
        int hash2 = primes[(key & 0x7fffffff) % primes.length];
        if (hash < 0)
            hash += map.size();
        do {
            Payload payload = map.get(hash);
            if (payload == null)
                return null;
            if (payload.key == key)
                return payload;
            hash += hash2;
            if (hash >= map.size())
                hash -= map.size();
        } while (true);
    }
}
```
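For comparison, the same idea can be sketched more compactly on a plain array instead of an ArrayList pre-filled with nulls. This is a variant written for illustration, not the code that was measured above: the names `IntKeyTable` and `Item` are made up, the capacity is rounded up to a power of two, and the probe step is forced to be odd so that it visits every slot. Unlike the original it replaces an existing key and fails loudly when the table is full.

```java
public class IntKeyTable {
    // Hypothetical value type that carries its own key.
    static final class Item {
        final int key;
        final String label;
        Item(int key, String label) { this.key = key; this.label = label; }
    }

    private final Item[] slots;
    private final int mask;
    private int size;

    // Capacity is rounded up to a power of two so that any odd probe step
    // visits every slot before repeating.
    IntKeyTable(int minCapacity) {
        if (minCapacity > (1 << 30))
            throw new IllegalArgumentException("capacity too large");
        int cap = 1;
        while (cap < minCapacity) cap <<= 1;
        slots = new Item[cap];
        mask = cap - 1;
    }

    // Scrambles the key so that clustered keys spread over the table.
    private static int spread(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    // Inserts the item, or replaces the one already stored under item.key.
    void put(Item item) {
        int h = spread(item.key);
        int idx = h & mask;
        int step = (h >>> 7) | 1;   // odd, so coprime with the table size
        for (int probes = 0; probes < slots.length; probes++) {
            Item cur = slots[idx];
            if (cur == null) { slots[idx] = item; size++; return; }
            if (cur.key == item.key) { slots[idx] = item; return; }
            idx = (idx + step) & mask;
        }
        throw new IllegalStateException("table is full");
    }

    // Returns the item stored under key, or null if there is none.
    Item get(int key) {
        int h = spread(key);
        int idx = h & mask;
        int step = (h >>> 7) | 1;
        for (int probes = 0; probes < slots.length; probes++) {
            Item cur = slots[idx];
            if (cur == null) return null;
            if (cur.key == key) return cur;
            idx = (idx + step) & mask;
        }
        return null;
    }

    int size() { return size; }

    public static void main(String[] args) {
        IntKeyTable t = new IntKeyTable(16);
        t.put(new Item(-5, "minus five"));
        t.put(new Item(1000003, "big"));
        t.put(new Item(-5, "replaced"));
        System.out.println(t.get(-5).label);   // prints "replaced"
        System.out.println(t.get(99));         // prints "null"
        System.out.println(t.size());          // prints "2"
    }
}
```

The memory argument is unchanged: the only per-element cost beyond the stored object is one slot in the reference array.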


answered Oct 14 '22 12:10

Tim Cooper

The simplest thing would be to look at the source and work it out that way. However, you're really comparing apples and oranges - lists and maps are conceptually quite distinct. It's rare that you would choose between them on the basis of memory usage.

What's the background behind this question?
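On the hash-function part of the question, a small experiment can stand in for reading the source. As far as I know, HashMap asks the key for its own hashCode() (so Object's version is used only if the key class doesn't override it), confirms matches with equals(), and keeps a reference to the key object itself rather than just the hash. The `CountingKey` class below is a made-up helper for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class HashProbe {
    // Hypothetical key type that counts how often the map consults it.
    static final class CountingKey {
        static int hashCalls = 0;
        static int equalsCalls = 0;
        final int id;
        CountingKey(int id) { this.id = id; }

        @Override public int hashCode() { hashCalls++; return id; }

        @Override public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof CountingKey && ((CountingKey) o).id == id;
        }
    }

    public static void main(String[] args) {
        Map<CountingKey, String> map = new HashMap<CountingKey, String>();
        CountingKey original = new CountingKey(1);
        map.put(original, "one");

        // A distinct but equal key finds the entry: the map hashed it via hashCode()
        // and confirmed the match via equals().
        System.out.println(map.get(new CountingKey(1)));                 // prints "one"
        System.out.println(CountingKey.hashCalls > 0);                   // prints "true"

        // The key object itself is retained by the map, not just its hash.
        System.out.println(map.keySet().iterator().next() == original);  // prints "true"
    }
}
```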

answered Oct 14 '22 10:10

Jon Skeet


Tags: java, memory-management, hashmap, arraylist

elhoim
