When using a hash map, it's important to evenly distribute the keys over the buckets.
If all keys end up in the same bucket, you essentially end up with a list.
Is there a way to "audit" a HashMap in Java in order to see how well the keys are distributed?
I tried subtyping it and iterating Entry<K,V>[] table, but it's not visible.
A HashMap has a number of buckets (implemented as an array) in which to store entries. When an item is added to the map, it is assigned to a bucket based on a value derived from its hashCode and the number of buckets in the HashMap. (Note that the bucket may already be occupied, which is called a collision.)
I tried subtyping it and iterating Entry[] table, but it's not visible
Use the Reflection API!
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class Main {
    // Simulates instances that are never equal but always go to the same bucket.
    static class A {
        @Override
        public boolean equals(Object obj) { return false; }
        @Override
        public int hashCode() { return 42; }
    }

    // Note: on Java 16+ the setAccessible calls below need
    // --add-opens java.base/java.util=ALL-UNNAMED on the command line.
    public static void main(String[] args) throws Exception {
        // Test data
        HashMap<A, String> map = new HashMap<A, String>(4);
        map.put(new A(), "abc");
        map.put(new A(), "def");
        // Access the internal table (it is null until the first put)
        Field table = HashMap.class.getDeclaredField("table");
        table.setAccessible(true);
        @SuppressWarnings("unchecked")
        Map.Entry<A, String>[] realTable = (Map.Entry<A, String>[]) table.get(map);
        // Iterate and pretty-print each bucket
        for (int i = 0; i < realTable.length; i++) {
            System.out.println(String.format("Bucket : %d, Entry: %s", i, bucketToString(realTable[i])));
        }
    }

    @SuppressWarnings("unchecked")
    private static String bucketToString(Map.Entry<A, String> entry) throws Exception {
        if (entry == null) return null;
        StringBuilder sb = new StringBuilder();
        // Access the "next" field of HashMap$Node
        Class<?> clazz = entry.getClass();
        Field next = clazz.getDeclaredField("next");
        next.setAccessible(true);
        // Walk the chain of entries in this bucket
        while (entry != null) {
            sb.append(entry);
            entry = (Map.Entry<A, String>) next.get(entry);
            if (null != entry) sb.append(" -> ");
        }
        return sb.toString();
    }
}
In the end you'll see something like this in STDOUT:
Bucket : 0, Entry: null
Bucket : 1, Entry: null
Bucket : 2, Entry: Main$A@2a=abc -> Main$A@2a=def
Bucket : 3, Entry: null
HashMap uses the hash codes produced by the hashCode() method of your key objects, so I guess you are really asking how evenly distributed those hash code values are. You can get hold of the key objects using Map.keySet().
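As a rough illustration of that idea, the sketch below takes the keys from keySet() and counts how many would fall into each of a given number of slots. The class name BucketHistogram and its histogram method are made up for this example, and choosing the slot with Math.floorMod is a simplifying assumption, not how HashMap actually places entries.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class BucketHistogram {
    // Counts how many keys would land in each of `slots` slots if the slot were
    // chosen as floorMod(hashCode, slots). This is a simplification for auditing
    // hashCode() quality; it does not reproduce HashMap's real bucket placement.
    static int[] histogram(Map<?, ?> map, int slots) {
        int[] counts = new int[slots];
        for (Object key : map.keySet()) {
            // Objects.hashCode returns 0 for a null key, which HashMap permits.
            counts[Math.floorMod(Objects.hashCode(key), slots)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            map.put(i * 16, "v" + i); // keys that are all multiples of 16
        }
        // With 16 slots every key falls into slot 0; with 7 slots they spread out.
        System.out.println(Arrays.toString(histogram(map, 16)));
        System.out.println(Arrays.toString(histogram(map, 7)));
    }
}
```

A heavily skewed histogram for the slot counts you care about suggests that the keys' hashCode() deserves a closer look.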
Now, the OpenJDK and Oracle implementations of HashMap do not use the key hash codes directly, but apply another hashing function to the provided hashes before distributing them over the buckets. This is an implementation detail that you should not rely on, so ignore it and just ensure that the hashCode() methods of your keys are well distributed.
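Purely to illustrate what that extra step looks like (not as something to depend on), here is a sketch that mirrors the spreading done by OpenJDK 8 and later: the upper 16 bits of the hash code are XORed into the lower 16, and the result is masked with the table length minus one. The class and method names are my own, and other JDK versions or vendors may do this differently.

```java
class BucketIndex {
    // Mirrors the hash spreading in OpenJDK 8+ HashMap:
    // fold the high 16 bits of the hash code into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // The table length is a power of two, so masking takes the place of a modulo.
    static int indexFor(Object key, int tableLength) {
        return spread(key) & (tableLength - 1);
    }

    public static void main(String[] args) {
        // An Integer's hash code is its value: 42 & 3 == 2.
        System.out.println(indexFor(42, 4));
        // Only bit 16 is set; the spreading folds it down, giving index 1 rather than 0.
        System.out.println(indexFor(1 << 16, 16));
    }
}
```

The first call agrees with the reflection output earlier in this thread, where keys with hash code 42 in a table of length 4 ended up in bucket 2.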
Examining the actual hash codes of some sample key objects is unlikely to tell you anything useful unless your hashCode() method is very poor. You would be better off doing a basic theoretical analysis of your hashCode() method. This is not as scary as it might sound. You may assume (indeed, you have no choice but to assume) that the hashCode() methods of the supplied Java classes are well distributed. Then you just need to check that the means you use for combining the hash codes of your data members behaves well for the expected values of those members. Only if your data members have values that are highly correlated in a peculiar way is this likely to be a problem.
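To make the "correlated in a peculiar way" case concrete, here is a small sketch comparing two ways of combining the hash codes of two int members when both members always hold the same value. The class CombineCheck, the XOR combiner, and the helper method are hypothetical and exist only for this illustration; Objects.hash is the standard library method.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.function.IntBinaryOperator;

class CombineCheck {
    // Hypothetical weak combiner: XOR is symmetric and cancels equal values.
    static int xorCombine(int x, int y) {
        return Integer.hashCode(x) ^ Integer.hashCode(y);
    }

    // Order-sensitive combiner from the standard library.
    static int libraryCombine(int x, int y) {
        return Objects.hash(x, y);
    }

    // Counts distinct combined hash codes over the pairs (i, i) for i in [0, n),
    // i.e. two data members that are fully correlated.
    static int distinctOnDiagonal(int n, IntBinaryOperator combiner) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < n; i++) {
            seen.add(combiner.applyAsInt(i, i));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // XOR of a value with itself is always 0, so every pair collides.
        System.out.println(distinctOnDiagonal(1000, CombineCheck::xorCombine));
        // Objects.hash keeps these 1000 pairs distinct.
        System.out.println(distinctOnDiagonal(1000, CombineCheck::libraryCombine));
    }
}
```

The same kind of check can be run over whatever value ranges you actually expect for your own data members.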