Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

are HashMaps with predefined capacity faster

Tags:

java

hashmap

I came across an algorithm on the net http://www.coderanch.com/t/201836/Performance/java/Hashtable-vs-Hashmap and decided to test it

public class MapTest{
    static int sizeOfTrial = 100000;
    static String[] keys = new String[sizeOfTrial];
    static String[] vals = new String[sizeOfTrial];

    public static void main(String[] args) {
        //init sizeOfTrial key/value pairs
        for (int i=0; i < sizeOfTrial; i++){
          String s1 = "key"+ i;
          String s2 = "val"+ i;
          keys[i] = s1;
          vals[i] = s2;
        }
        test(new TreeMap(), "TreeMap");
        test(new Hashtable(), "Hashtable");
        test(new HashMap(), "HashMap");
        test(new Hashtable(200000), "Hashtable presized");
        test(new HashMap(200000), "HashMap presized");
    }

  public static void test(Map tm, String name){
    long t1 = System.currentTimeMillis();
    for (int i=0; i < sizeOfTrial; i++){
      tm.put(keys[i],vals[i]);
    }
    for (int i=0; i < sizeOfTrial; i++){
      tm.get(keys[i]);
    }
    long t2 = System.currentTimeMillis();
    System.out.println("total time for " + name + ": " + (t2-t1));
  }
}

and i got the following results

total time for TreeMap: 1744
total time for Hashtable: 446
total time for HashMap: 234
total time for Hashtable presized: 209
total time for HashMap presized: 196

Is this JVM dependent and arbitrary or does it really provide faster access and storage time?

like image 569
Nav Avatar asked Apr 20 '12 01:04

Nav


People also ask

Why are hashmaps so fast?

Hashmaps use the hashcode of the key to access directly the bucket where the entry is stored. This is an O(1) access. If more than one element is in that bucket because of the same or similar hashcode, then you have a few more checks, but it's still way faster than iterating through a list and searching for an element.

What can cause a HashMap to be less efficient?

HashMap's Bottleneck Because non-equal objects can have the same hash codes (a phenomenon called hash code collision), buckets can grow in size.

Are hashmaps faster than ArrayList?

The ArrayList has O(n) performance for every search, so for n searches its performance is O(n^2). The HashMap has O(1) performance for every search (on average), so for n searches its performance will be O(n). While the HashMap will be slower at first and take more memory, it will be faster for large values of n.

Does HashMap size affects the performance of HashMap?

HashMap 's get has an expected constant running time, which means its running time shouldn't depend on the size of the HashMap . This, of course, relies on a decent implementation of the hashCode method of your key, but your key is String , so it shouldn't be a problem.


1 Answers

Predefining the expected size of any container-type class will provide faster storage time simply because the storage doesn't have to be dynamically re-allocated at runtime as often. Usually the backing storage is some sort of array, and when you go beyond the available capacity, the array has to be copied into a new larger array. This is an expensive operation that may have to happen multiple times if you store a large number of objects into a container that started at a very small capacity.

The performance of reading from the map shouldn't be affected either way. You could demonstrate this better by timing the tm.put part separately from the tm.get part.


Edit: To illustrate this point further, I modified the code to time tm.put separately from tm.get. Here are the results on my machine:

total time for TreeMap tm.put: 159
total time for TreeMap tm.get: 74
total time for Hashtable tm.put: 20
total time for Hashtable tm.get: 10
total time for HashMap tm.put: 42
total time for HashMap tm.get: 5
total time for Hashtable presized tm.put: 11
total time for Hashtable presized tm.get: 9
total time for HashMap presized tm.put: 6
total time for HashMap presized tm.get: 4

Notice that the difference between Hashtable regular and presized for tm.put is a factor of ~2. SImilarly, with HashMap the difference between regular and presized is a factor of ~7 for storing. However, looking at the reading side, both Hashtable and Hashmap have roughly the same timings for tm.get in both cases (10 ms vs 9 ms for Hashtable, and 5 ms vs 4 ms for HashMap). Also notice that in the presized case, both putting and getting take roughly about the same amount of total time.

like image 153
mellamokb Avatar answered Oct 05 '22 13:10

mellamokb