I was researching HashMap and came across the following analysis:
https://stackoverflow.com/questions/11596549/how-does-javas-hashmap-work-internally/18492835#18492835
Q1 Can you show, using a simple map, how the hash code for a given key is calculated in detail, and how the bucket number (the position where the element is placed) follows from the formula hash % (arrayLength-1)? Let's say I have this HashMap:
HashMap<String, String> map = new HashMap<>(); // HashMap does not guarantee any key order
map.put("Amit","Java");
map.put("Saral","J2EE");
Q2 Sometimes the hash codes of 2 different objects are the same. In this case both objects are saved in one bucket and are represented as a linked list. The entry point is the more recently added object. This object refers to the other object through its next field, and so on. The last entry refers to null. Can you show me this with a real example?
"Amit" will be distributed to the 10th bucket, because of the bit twiddling. If there were no bit twiddling it would go to the 7th bucket, because 2044535 & 15 = 7. How is this possible? Please explain the whole calculation in detail.
How is the hash code for the given key calculated in detail?
For a String, it is calculated by String#hashCode(), which is implemented as follows:
public int hashCode() {
    int h = hash;
    int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;
        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}
This basically follows the equation in the Javadoc (where ^ denotes exponentiation and n is the length of the string):
hashcode = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
One interesting thing to note about this implementation is that String actually caches its hash code. It can do this because String is immutable.
If I calculate the hash code of the String "Amit", it yields this integer:
System.out.println("Amit".hashCode());
> 2044535
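To make the arithmetic behind that number visible, here is a small sketch that re-evaluates the Javadoc formula one character at a time. The class and method names (StringHashDemo, polynomialHash) are made up for this illustration and are not part of the JDK.

```java
class StringHashDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] with Horner's rule.
    // int overflow simply wraps around, as it does in String#hashCode().
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        int h = 0;
        for (char c : "Amit".toCharArray()) {
            h = 31 * h + c;
            // 'A' = 65   -> 65
            // 'm' = 109  -> 65 * 31 + 109    = 2124
            // 'i' = 105  -> 2124 * 31 + 105  = 65949
            // 't' = 116  -> 65949 * 31 + 116 = 2044535
            System.out.println("after '" + c + "': " + h);
        }
        // The hand-rolled version agrees with the built-in one.
        System.out.println(polynomialHash("Amit") == "Amit".hashCode());
    }
}
```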
Let's walk through a simple put on a map, but first we have to look at how the map is built.
The most interesting fact about a Java HashMap is that it always has 2^n buckets. So if you call the default constructor, the number of buckets is 16, which is 2^4.
When you do a put operation on this map, it first gets the hash code of the key. Some fancy bit twiddling is then applied to this hash code to ensure that poor hash functions (especially those that do not differ in the lower bits) don't "overload" a single bucket.
The real function that is actually responsible for distributing your key to the buckets is the following:
h & (length-1); // length is the current number of buckets, h the hashcode of the key
This only works for power of two bucket sizes, because it uses & to map the key to a bucket instead of a modulo.
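The following sketch illustrates why the mask needs a power-of-two length. The class and helper names are hypothetical; only Math.floorMod and TreeSet come from the JDK. For a length of 16 the mask agrees with a non-negative modulo, while for a length of 12 some buckets can never be selected.

```java
import java.util.TreeSet;

class MaskVersusModulo {
    // Same expression as above: maps a hash to a bucket index.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Collects every index the mask produces for a sample range of hashes.
    static TreeSet<Integer> reachableIndexes(int length) {
        TreeSet<Integer> seen = new TreeSet<>();
        for (int h = -1000; h <= 1000; h++) {
            seen.add(indexFor(h, length));
        }
        return seen;
    }

    public static void main(String[] args) {
        // Power of two: the mask equals the non-negative remainder.
        System.out.println(indexFor(2044535, 16));      // 7
        System.out.println(Math.floorMod(2044535, 16)); // 7

        // All 16 buckets are reachable with length 16.
        System.out.println(reachableIndexes(16).size()); // 16

        // With length 12 the mask is 11 (binary 1011), so bit 2 is always
        // cleared and buckets 4 to 7 stay empty forever.
        System.out.println(reachableIndexes(12)); // [0, 1, 2, 3, 8, 9, 10, 11]
    }
}
```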
"Amit" will be distributed to the 10th bucket, because of the bit twiddeling. If there were no bit twiddeling it would go to the 7th bucket, because 2044535 & 15 = 7
.
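Here is a sketch of how the numbers 7 and 10 can be reproduced. The spread method below is my recollection of the bit-spreading step in older (JDK 6 era) HashMap sources, so treat it as an assumption rather than a quote; the class name and the indexFor helper are made up for the example.

```java
class SupplementalHashDemo {
    // Assumed to match the supplemental hash of older HashMap versions:
    // it folds higher bits into the lower ones before the mask is applied.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int raw = "Amit".hashCode(); // 2044535, i.e. 0x1F3277

        // Without bit twiddling only the lowest four bits (0x7) count.
        System.out.println(indexFor(raw, 16)); // 7

        // With bit twiddling the higher bits change the lowest four bits.
        System.out.println(indexFor(spread(raw), 16)); // 10
    }
}
```

Newer JDKs (8 and later) use a simpler spread, h ^ (h >>> 16), so the bucket you observe there can differ from the one shown here.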
Now that we have an index, we can find the bucket. If the bucket contains elements, we have to iterate over them and replace an equal entry if we find one. If no matching entry is found in the linked list, we just add the new entry at the beginning of the linked list.
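To make the collision case concrete, here is a minimal sketch of a bucket array with chained entries and insertion at the head, as described above. It is not the real HashMap code: the class, its Entry type and the chain helper are invented for illustration, and the bit twiddling is left out to keep it short. The keys "Aa" and "BB" are a well-known pair with the same String hash code (2112).

```java
class CollisionChainDemo {
    // One node of the per-bucket linked list.
    static final class Entry {
        final String key;
        String value;
        Entry next;

        Entry(String key, String value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    final Entry[] table = new Entry[16];

    // Returns the previous value for the key, or null if it was absent.
    String put(String key, String value) {
        int i = key.hashCode() & (table.length - 1);
        for (Entry e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) { // equal key found: replace its value
                String old = e.value;
                e.value = value;
                return old;
            }
        }
        // No equal key: the new entry becomes the head and points to the old head.
        table[i] = new Entry(key, value, table[i]);
        return null;
    }

    // Renders the chain of one bucket, e.g. "BB -> Aa -> null".
    String chain(int bucket) {
        StringBuilder sb = new StringBuilder();
        for (Entry e = table[bucket]; e != null; e = e.next) {
            sb.append(e.key).append(" -> ");
        }
        return sb.append("null").toString();
    }

    public static void main(String[] args) {
        CollisionChainDemo map = new CollisionChainDemo();
        map.put("Aa", "first");  // hash 2112, bucket 0
        map.put("BB", "second"); // hash 2112 as well, same bucket
        System.out.println(map.chain(0)); // BB -> Aa -> null
    }
}
```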
The next important thing in HashMap is the resizing: if the actual size of the map exceeds a threshold (determined by the current number of buckets and the load factor, in our case 16*0.75=12), it will resize the backing array.
A resize always doubles the current number of buckets, so the count is guaranteed to remain a power of two and the function that finds the buckets keeps working.
Since the number of buckets changes, we have to rehash all the current entries in our table.
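As a rough sketch of what the resize means for a single entry, the snippet below computes the threshold and recomputes the bucket index for the raw hash 2044535 after the table doubles from 16 to 32 buckets. The class and method names are hypothetical, and the bit twiddling is again omitted for brevity.

```java
class ResizeDemo {
    // Number of entries above which the table is resized.
    static int threshold(int buckets, float loadFactor) {
        return (int) (buckets * loadFactor);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12

        int h = 2044535;
        int before = indexFor(h, 16); // 7
        int after = indexFor(h, 32);  // 23
        System.out.println(before + " -> " + after);

        // Doubling adds one more bit to the mask, so an entry either stays
        // at its old index or moves up by exactly the old length.
        System.out.println(after == before || after == before + 16); // true
    }
}
```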
This is quite costly, so if you know how many items there will be, you should initialize the HashMap with a capacity large enough for that count (at least count / load factor) so it does not have to resize repeatedly.
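A possible way to pre-size a map, sketched under the assumption of the default load factor of 0.75. The helper capacityFor is not a JDK method, just a name chosen for this example; new HashMap<>(int) is the standard constructor that takes an initial capacity.

```java
import java.util.HashMap;
import java.util.Map;

class PresizeDemo {
    // Smallest capacity request for which `expected` entries do not
    // exceed capacity * loadFactor.
    static int capacityFor(int expected, float loadFactor) {
        return (int) Math.ceil(expected / (double) loadFactor);
    }

    public static void main(String[] args) {
        int expected = 1000;

        // 1000 / 0.75 rounds up to 1334; HashMap rounds the request up to
        // the next power of two internally.
        Map<Integer, String> map = new HashMap<>(capacityFor(expected, 0.75f));
        for (int i = 0; i < expected; i++) {
            map.put(i, "value" + i);
        }
        System.out.println(map.size()); // 1000
    }
}
```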