I have updated a Java application to Java 8. The application heavily relies on HashMaps. When I run the benchmarks, I see unpredictable behavoir. For some inputs, the application runs faster than before, but for larger inputs, it's constantly slower. I've checked the profiler and the most time consuming operation is HashMap.get. I suspect the changes are due to the HashMap modification in Java 8, but it may not be true, as I have changed some other parts. Is there an easy way that I hook in the original Java 7 HashMap into my Java 8 application so that I only change the hashmap implementation to see if I still observe the change in performance. The following is a minimal program that tries to simulate what my application is doing. The basic idea is that i need to share nodes in the application. At some runtime point, a node should be retrieved or created if it already does not exist based on some integer properties. The following only uses two integer, but in the real application I have one, two and three integer keys. <pre class="prettyprint"><code>import java.util.HashMap; import java.util.Map; import java.util.Random; public class Test1 { static int max_k1 = 500; static int max_k2 = 500; static Map<Node, Node> map; static Random random = new Random(); public static void main(String[] args) { for (int i = 0; i < 15; i++) { long start = System.nanoTime(); run(); long end = System.nanoTime(); System.out.println((end - start) / 1000_000); } } private static void run() { map = new HashMap<>(); for (int i = 0; i < 10_000_000; i++) { Node key = new Node(random.nextInt(max_k1), random.nextInt(max_k2)); Node val = getOrElseUpdate(key); } } private static Node getOrElseUpdate(Node key) { Node val; if ((val = map.get(key)) == null) { val = key; map.put(key, val); } return val; } private static class Node { private int k1; private int k2; public Node(int k1, int k2) { this.k1 = k1; this.k2 = k2; } @Override public int hashCode() { int result = 17; result = 31 * result + k1; result = 31 * result + k2; return result; } @Override public boolean equals(Object obj) { if (this == obj) return true; if (!(obj instanceof Node)) return false; Node other = (Node) obj; return k1 == other.k1 && k2 == other.k2; } } } </code></pre> The benchmarking is primitive, but still, this is the result of 15 runs on Java 8: <pre class="prettyprint"><code>8143 7919 7984 7973 7948 7984 7931 7992 8038 7975 7924 7995 6903 7758 7627 </code></pre> and this is for Java 7: <pre class="prettyprint"><code>7247 6955 6510 6514 6577 6489 6510 6570 6497 6482 6540 6462 6514 4603 6270 </code></pre> The benchmarking is primitive, so I appreciate if someone who is familiar with JMH or other benchmarking tools run it, but from what I observe the results are better for Java 7. Any ideas?

Your <code>hashCode()</code> is very poor. In example you posted you have 250000 unique values but only 15969 unique hash codes. Because of lot of collisions, Java 8 swaps lists with trees. In your case it only adds overhead, because many elements not only have the same position in hash table but also the same hash code. The tree ends up as a linked list anyway. There are couple of ways to fix this: <ol> <li>Improve your hashCode. <code>return k1 * 500 + k2;</code> resolves the issue. </li> <li>Use THashMap. Open addressing should work better in case of collisions.</li> <li>Make <code>Node</code> implement <code>Comparable</code>. This will be used by <code>HashMap</code> to construct balanced tree in case of conflicts.</li> </ol>

Using Java 7 HashMap in Java 8

I have updated a Java application to Java 8. The application heavily relies on HashMaps. When I run the benchmarks, I see unpredictable behavoir. For some inputs, the application runs faster than before, but for larger inputs, it's constantly slower.

I've checked the profiler and the most time consuming operation is HashMap.get. I suspect the changes are due to the HashMap modification in Java 8, but it may not be true, as I have changed some other parts.

Is there an easy way that I hook in the original Java 7 HashMap into my Java 8 application so that I only change the hashmap implementation to see if I still observe the change in performance.

The following is a minimal program that tries to simulate what my application is doing. The basic idea is that i need to share nodes in the application. At some runtime point, a node should be retrieved or created if it already does not exist based on some integer properties. The following only uses two integer, but in the real application I have one, two and three integer keys.

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class Test1 {

static int max_k1 = 500;
static int max_k2 = 500;

static Map<Node, Node> map;
static Random random = new Random();

public static void main(String[] args) {
    for (int i = 0; i < 15; i++) {
        long start = System.nanoTime();
        run();
        long end = System.nanoTime();
        System.out.println((end - start) / 1000_000);
    }
}

private static void run() {
    map = new HashMap<>();
    for (int i = 0; i < 10_000_000; i++) {
        Node key = new Node(random.nextInt(max_k1), random.nextInt(max_k2));
        Node val = getOrElseUpdate(key);
    }
}

private static Node getOrElseUpdate(Node key) {
    Node val;
    if ((val = map.get(key)) == null) {
        val = key;
        map.put(key, val);
    }
    return val;
}

private static class Node {

    private int k1;
    private int k2;

    public Node(int k1, int k2) {
        this.k1 = k1;
        this.k2 = k2;
    }

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + k1;
        result = 31 * result + k2;
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;

        if (!(obj instanceof Node))
            return false;

        Node other = (Node) obj;

        return k1 == other.k1 && k2 == other.k2;
    }
  }
}

The benchmarking is primitive, but still, this is the result of 15 runs on Java 8:

and this is for Java 7:

The benchmarking is primitive, so I appreciate if someone who is familiar with JMH or other benchmarking tools run it, but from what I observe the results are better for Java 7. Any ideas?

How does HashMap & Collections differs from Java 7 & 8?

In Java 7 after calculating hash from hash function if more then one element has same hash than they are searched by linear search so it's complexity is (n). In Java 8 that search is performed by binary search so the complexity will become log(n).

How HashMap works internally in Java 8 with example?

In Java 8, HashMap replaces linked list with a binary tree when the number of elements in a bucket reaches certain threshold. While converting the list to binary tree, hashcode is used as a branching variable.

What is default size of HashMap?

Constructs an empty HashMap with the default initial capacity (16) and the default load factor (0.75).

Your hashCode() is very poor. In example you posted you have 250000 unique values but only 15969 unique hash codes. Because of lot of collisions, Java 8 swaps lists with trees. In your case it only adds overhead, because many elements not only have the same position in hash table but also the same hash code. The tree ends up as a linked list anyway.

There are couple of ways to fix this:

Improve your hashCode. return k1 * 500 + k2; resolves the issue.
Use THashMap. Open addressing should work better in case of collisions.
Make Node implement Comparable. This will be used by HashMap to construct balanced tree in case of conflicts.

Using Java 7 HashMap in Java 8

Tags:

java

performance

hashmap

Wickoo

People also ask

1 Answers

Piotr Praszmo

Recent Activity

Donate For Us

Using Java 7 HashMap in Java 8

Tags:

java

performance

hashmap

Wickoo

People also ask

1 Answers

Piotr Praszmo

Related questions

Recent Activity

Donate For Us