I have updated a Java application to Java 8. The application heavily relies on HashMaps. When I run the benchmarks, I see unpredictable behavoir. For some inputs, the application runs faster than before, but for larger inputs, it's constantly slower.
I've checked the profiler and the most time consuming operation is HashMap.get. I suspect the changes are due to the HashMap modification in Java 8, but it may not be true, as I have changed some other parts.
Is there an easy way that I hook in the original Java 7 HashMap into my Java 8 application so that I only change the hashmap implementation to see if I still observe the change in performance.
The following is a minimal program that tries to simulate what my application is doing. The basic idea is that i need to share nodes in the application. At some runtime point, a node should be retrieved or created if it already does not exist based on some integer properties. The following only uses two integer, but in the real application I have one, two and three integer keys.
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
public class Test1 {
static int max_k1 = 500;
static int max_k2 = 500;
static Map<Node, Node> map;
static Random random = new Random();
public static void main(String[] args) {
for (int i = 0; i < 15; i++) {
long start = System.nanoTime();
run();
long end = System.nanoTime();
System.out.println((end - start) / 1000_000);
}
}
private static void run() {
map = new HashMap<>();
for (int i = 0; i < 10_000_000; i++) {
Node key = new Node(random.nextInt(max_k1), random.nextInt(max_k2));
Node val = getOrElseUpdate(key);
}
}
private static Node getOrElseUpdate(Node key) {
Node val;
if ((val = map.get(key)) == null) {
val = key;
map.put(key, val);
}
return val;
}
private static class Node {
private int k1;
private int k2;
public Node(int k1, int k2) {
this.k1 = k1;
this.k2 = k2;
}
@Override
public int hashCode() {
int result = 17;
result = 31 * result + k1;
result = 31 * result + k2;
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (!(obj instanceof Node))
return false;
Node other = (Node) obj;
return k1 == other.k1 && k2 == other.k2;
}
}
}
The benchmarking is primitive, but still, this is the result of 15 runs on Java 8:
8143
7919
7984
7973
7948
7984
7931
7992
8038
7975
7924
7995
6903
7758
7627
and this is for Java 7:
7247
6955
6510
6514
6577
6489
6510
6570
6497
6482
6540
6462
6514
4603
6270
The benchmarking is primitive, so I appreciate if someone who is familiar with JMH or other benchmarking tools run it, but from what I observe the results are better for Java 7. Any ideas?
In Java 7 after calculating hash from hash function if more then one element has same hash than they are searched by linear search so it's complexity is (n). In Java 8 that search is performed by binary search so the complexity will become log(n).
In Java 8, HashMap replaces linked list with a binary tree when the number of elements in a bucket reaches certain threshold. While converting the list to binary tree, hashcode is used as a branching variable.
Constructs an empty HashMap with the default initial capacity (16) and the default load factor (0.75).
Your hashCode()
is very poor. In example you posted you have 250000 unique values but only 15969 unique hash codes. Because of lot of collisions, Java 8 swaps lists with trees. In your case it only adds overhead, because many elements not only have the same position in hash table but also the same hash code. The tree ends up as a linked list anyway.
There are couple of ways to fix this:
Improve your hashCode. return k1 * 500 + k2;
resolves the issue.
Use THashMap. Open addressing should work better in case of collisions.
Make Node
implement Comparable
. This will be used by HashMap
to construct balanced tree in case of conflicts.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With