
Regarding the HashMap implementation in Java

I was researching HashMap and came up with the following analysis:

https://stackoverflow.com/questions/11596549/how-does-javas-hashmap-work-internally/18492835#18492835

Q1 Can you show me, with a simple map, how the hashcode for the given key is calculated in detail, and how the position (bucket number) where the element should be placed is derived by using this formula: hash % (arrayLength-1)? Let's say I have this HashMap:

    HashMap<String, String> map = new HashMap<String, String>(); // HashMap iterates its keys in no guaranteed order
    map.put("Amit", "Java");
    map.put("Saral", "J2EE");

Q2 Sometimes the hashCodes of 2 different objects are the same. In this case the 2 objects are saved in one bucket and are represented as a linked list. The entry point is the more recently added object. This object refers to the other object through its next field, and so on. The last entry refers to null. Can you show me this with a real example?


"Amit" will be distributed to the 10th bucket, because of the bit twiddling. If there were no bit twiddling it would go to the 7th bucket, because 2044535 & 15 = 7. How is this possible? Please explain the whole calculation in detail.

Snapshots updated: [two images, not reproduced here]

DON asked Nov 14 '22 01:11


1 Answer

Regarding "how hashcode for the given key is calculated in detail by using this formula":

In the case of a String key, the hashcode is calculated by String#hashCode(). In older JDKs (those in which String still has the offset and count fields) it is implemented as follows:

    public int hashCode() {
        int h = hash;
        int len = count;
        if (h == 0 && len > 0) {
            int off = offset;
            char val[] = value;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }

This basically follows the equation given in the Javadoc (where ^ denotes exponentiation and n is the length of the string):

 hashcode = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

One interesting thing to note on this implementation is that String actually caches its hash code. It can do this, because String is immutable.

If I calculate the hashcode of the String "Amit", it yields this integer:

System.out.println("Amit".hashCode());
>     2044535
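To make the arithmetic behind that number visible, here is a minimal sketch that recomputes the hash by hand. The class name StringHashDemo and the helper polyHash are made up for illustration; the loop simply mirrors the String#hashCode() loop shown above.

```java
class StringHashDemo {
    // Hypothetical helper (not a JDK method): same loop as String#hashCode().
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow simply wraps around
        }
        return h;
    }

    public static void main(String[] args) {
        // 'A' = 65, 'm' = 109, 'i' = 105, 't' = 116
        // 65*31^3 + 109*31^2 + 105*31 + 116
        //   = 1936415 + 104749 + 3255 + 116 = 2044535
        System.out.println(polyHash("Amit"));                      // 2044535
        System.out.println(polyHash("Amit") == "Amit".hashCode()); // true
    }
}
```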

Let's walk through a simple put to a map, but first we have to determine how the map is built. The most interesting fact about a Java HashMap is that it always has 2^n buckets. So if you call the default constructor, the number of buckets is 16, which is 2^4.

When you do a put operation on this map, it first gets the hashcode of the key. Some fancy bit twiddling is then applied to this hashcode to ensure that poor hash functions (especially those whose results do not differ in the lower bits) don't "overload" a single bucket.

The real function that is actually responsible for distributing your key to the buckets is the following:

 h & (length-1); // length is the current number of buckets, h the hashcode of the key

This only works for power-of-two bucket counts, because it uses & to map the key to a bucket instead of a modulo.
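A small sketch of why the mask needs a power-of-two length. The class and method names are hypothetical. It compares the mask with a real modulo for 16 buckets, then shows which indexes stay reachable if the length were 10.

```java
import java.util.Set;
import java.util.TreeSet;

class MaskVsModulo {
    // Hypothetical helper: the masking step quoted above.
    static int indexByMask(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int h = 2044535; // "Amit".hashCode()

        // With 16 buckets the mask is 15 (binary 1111), which keeps the low
        // four bits: the same result as a non-negative modulo by 16.
        System.out.println(indexByMask(h, 16));   // 7
        System.out.println(Math.floorMod(h, 16)); // 7

        // With 10 buckets the mask would be 9 (binary 1001), so only
        // indexes built from those two bits can ever be produced.
        Set<Integer> reachable = new TreeSet<>();
        for (int i = 0; i < 1000; i++) {
            reachable.add(indexByMask(i, 10));
        }
        System.out.println(reachable); // [0, 1, 8, 9]
    }
}
```

With a length of 10, six of the ten buckets would never be used, which is why the bucket count is kept at a power of two.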

"Amit" will be distributed to the 10th bucket (index 10), because of the bit twiddling. If there were no bit twiddling it would go to the 7th bucket (index 7), because 2044535 & 15 = 7.
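The two numbers can be reproduced with the sketch below. It assumes the supplemental hash function as I remember it from older (Java 6-era) HashMap sources. Newer JDKs spread the bits differently, so a current HashMap may place "Amit" in another bucket. The class name is made up for illustration.

```java
class OldHashMapIndexDemo {
    // Assumption: the bit twiddling used by older (Java 6-era) HashMap
    // sources. It folds higher bits into the lower bits of the hashcode.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // The masking step quoted in the answer.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int raw = "Amit".hashCode(); // 2044535 = 0x1F3277

        // Without bit twiddling: only the low four bits (0x7) are used.
        System.out.println(indexFor(raw, 16));       // 7

        // With bit twiddling: the higher bits change the low four bits.
        System.out.println(indexFor(hash(raw), 16)); // 10
    }
}
```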

Now that we have an index for it, we can find the bucket. If the bucket contains elements, we have to iterate over them and replace an equal entry if we find one. If no equal item is found in the linked list, we just add the new entry at the beginning of the linked list.
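For Q2, here is a minimal sketch of such a bucket chain. It is not the real HashMap code: the class BucketChainDemo and its Node type are hypothetical, the bit twiddling is left out, and the table never resizes. It uses "Aa" and "BB", two different strings that share the hashcode 2112 and therefore always land in the same bucket.

```java
class BucketChainDemo {
    // Hypothetical entry type: key, value and a link to the next entry.
    static final class Node {
        final String key;
        String value;
        Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    final Node[] table = new Node[16];

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    void put(String key, String value) {
        int i = indexFor(key.hashCode(), table.length);
        // Walk the chain and replace the value if an equal key exists.
        for (Node n = table[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                n.value = value;
                return;
            }
        }
        // No equal key found: the new entry becomes the head of the chain.
        table[i] = new Node(key, value, table[i]);
    }

    public static void main(String[] args) {
        BucketChainDemo map = new BucketChainDemo();
        map.put("Aa", "first");  // "Aa".hashCode() == 2112
        map.put("BB", "second"); // "BB".hashCode() == 2112 as well

        int i = indexFor("Aa".hashCode(), 16); // 2112 & 15 = 0
        for (Node n = map.table[i]; n != null; n = n.next) {
            System.out.println(n.key + "=" + n.value);
        }
        // Prints BB=second, then Aa=first; the last entry's next is null.
    }
}
```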

The next important thing in HashMap is resizing. If the actual size of the map goes over a threshold (determined by the current number of buckets and the load factor, in our case 16*0.75=12), it resizes the backing array. The new size is always 2 * the current number of buckets, which is guaranteed to be a power of two, so the function that finds the buckets does not break.
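The threshold arithmetic can be sketched as follows. The class and helper names are made up, and the exact moment at which the real HashMap triggers a resize may differ between JDK versions.

```java
class ResizeThresholdDemo {
    // Hypothetical helper: threshold = number of buckets * load factor.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16; // default number of buckets
        for (int step = 0; step < 3; step++) {
            System.out.println(capacity + " buckets, threshold "
                    + threshold(capacity, 0.75f));
            capacity *= 2; // doubling keeps the bucket count a power of two
        }
        // Prints:
        // 16 buckets, threshold 12
        // 32 buckets, threshold 24
        // 64 buckets, threshold 48
    }
}
```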

Since the number of buckets changes, we have to rehash all the current entries in our table. This is quite costly, so if you know how many items there will be, you should initialize the HashMap with a capacity sized for that count so it does not have to resize repeatedly.
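One way to presize a map is sketched below. The helper capacityFor is hypothetical and assumes the default load factor of 0.75. HashMap(int initialCapacity) is the regular constructor, and it rounds the requested capacity up to a power of two internally.

```java
import java.util.HashMap;
import java.util.Map;

class PresizedMapDemo {
    // Hypothetical helper: a capacity whose threshold (capacity * 0.75)
    // is at least the expected number of entries.
    static int capacityFor(int expectedEntries) {
        return (int) (expectedEntries / 0.75f) + 1;
    }

    public static void main(String[] args) {
        int expected = 1000;
        Map<String, String> map = new HashMap<>(capacityFor(expected));
        for (int i = 0; i < expected; i++) {
            map.put("key" + i, "value" + i);
        }
        System.out.println(capacityFor(expected)); // 1334
        System.out.println(map.size());            // 1000
    }
}
```

Passing the raw count (1000) instead would still trigger one resize here, because the threshold of the resulting 1024 buckets is only 768.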

Thomas Jungblut answered Dec 31 '22 03:12