Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HashMap vs LinkedHashMap performance in iteration over values()

Is there any performance difference between HashMap and LinkedHashMap for traversal through values() function?

like image 744
MBZ Avatar asked Oct 21 '12 14:10

MBZ


People also ask

Is HashMap faster than LinkedHashMap?

LinkedHashMap can also maintain its elements in access order meaning the order in which the entries are accessed. As with LinkedHashMap, a doubly-linked list has to be maintained, it has less performance than HashMap.

Which is better LinkedHashMap or HashMap?

The Major Difference between the HashMap and LinkedHashMap is the ordering of the elements. The LinkedHashMap provides a way to order and trace the elements. Comparatively, the HashMap does not support the ordering of the elements.

Which offers the best performance TreeMap HashMap LinkedHashMap?

That is, if you need to get the keys back in insertion order, then use LinkedHashMap. If you need to get the keys back in their true/natural order, then use TreeMap. Otherwise, HashMap is probably best. It is typically faster and requires less overhead.


7 Answers

I think the LinkedHashMap has to be faster in traversal due to a superior nextEntry implementation in its Iterator

Here is why :

Let us go step by step from the values implementation.
The HashMap implementation of values is this :

public Collection<V> values() {
    Collection<V> vs = values;
    return (vs != null ? vs : (values = new Values()));
}

The LinkedHashMap extends from HashMap and inherits the same implementation.

The difference is in the Iterator implementation for the Values in both.

for HashMap it extends from java.util.HashMap.HashIterator

private final class ValueIterator extends HashIterator<V> {
    public V next() {
        return nextEntry().value;
    }
}

but for LinkedHashMap it extends from java.util.LinkedHashMap.LinkedHashIterator

private class ValueIterator extends LinkedHashIterator<V> {
    public V next() { return nextEntry().value; }
}

so the difference essentially boils down to nextEntry implementation.

For LinkedHashMap it is just calling e.after where e is the Entry , but for HashMap there is some work involved in traversing the Entry[] array to find the next next.

UPDATE : Code for nextEntry() in HashMap

final Entry<K,V> nextEntry() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
    Entry<K,V> e = next;
    if (e == null)
        throw new NoSuchElementException();

    if ((next = e.next) == null) {
        Entry[] t = table;
        while (index < t.length && (next = t[index++]) == null)
            ;
    }
    current = e;
    return e;
}

The Entry[] is not a contiguous store. (There could be null values in between). If you take a look at the above code, what it does is point next to current and find the next next by iterating over the Entry[] .

But I think this performance gain will come at the cost of insertion. Check out the addEntry method in both classes as an exercise.

like image 171
Ajay George Avatar answered Sep 30 '22 22:09

Ajay George


I wrote a little profiling program creating 1 million keys (Integer) vs Boolean.TRUE, repeating 100 times. Found the following:

HashMap:-
Create:  3.7sec
Iterate: 1.1sec
Access:  1.5sec
Total:   6.2sec

LinkedHashMap:-
Create:  4.7sec   (30% slower)
Iterate: 0.5sec   (50% faster)
Access:  0.8sec   (50% faster)
Total :  6.0sec

Garbage collection NOT done so pollutes the numbers somewhat, however I think LinkedHashMap has the edge over HashMap and I will be using that in future code.

like image 45
user1016765 Avatar answered Sep 30 '22 21:09

user1016765


It almost does not matter. The question is: what do you need. If order of elements is relevant you have to use LinkedHashMap. Otherwise you just do not need it, so use HashMap.

like image 39
AlexR Avatar answered Sep 30 '22 22:09

AlexR


I tried in an UnitTest, iterated values() 10000 times, the milliseconds: 806 vs 902. It is almost the same.

like image 41
hongtium Avatar answered Sep 30 '22 22:09

hongtium


Yes, there will be the same performance difference as you get in all iterations over HashMap versus LinkedHashMap: HashMap will take time proportional to the number of entries plus the size of the hash table, and LinkedHashMap will just take time proportional to the number of entries.

like image 28
Louis Wasserman Avatar answered Sep 30 '22 21:09

Louis Wasserman


The best advice would be "Don't be afraid to try it out" but I'm quite sure they are very similar. Getter for the value set is O(1) and so is each iterator step. Iterating through a linked list is as trivial as iterating through the hash buckets, with a possible small edge in favor of the linked list.

like image 29
Marko Topolnik Avatar answered Sep 30 '22 22:09

Marko Topolnik


Code...

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapTest
{
    public static void main(String[] args)
    {
        int iterations = 1000000;

        long time1, time2, time3;
        System.nanoTime(); //Just to make sure any possible overhead is done...though there shouldn't really be any

        int sequential[] = new int[iterations]; //Counting up from 0
        int random[] = new int[iterations]; //Same set of values, but randomized (no duplicates)
        HashSet<Integer> addedRandoms = new HashSet<>();
        for (int i = 0; i < iterations; i++)
        {
            sequential[i] = i;

            int randomVal = random(iterations);
            while (addedRandoms.contains(randomVal)) randomVal = random(iterations); //Get another random instead of sequentially finding another unused value, to prevent clumping
            addedRandoms.add(randomVal);
            random[i] = random(iterations);
        }

        HashMap<Integer, Integer> hashMap = new HashMap<>();
        LinkedHashMap<Integer, Integer> linkedHashMap = new LinkedHashMap<>();


        int gcRuns = 0, prevGCRuns;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) gcRuns += gcBean.getCollectionCount();
        prevGCRuns = gcRuns;


        //Test
        time1 = System.nanoTime();
        for (int i : sequential) hashMap.put(i, 0);
        time2 = System.nanoTime();
        for (int i : sequential) linkedHashMap.put(i, 0);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Put: sequential key (from 0 to " + (iterations - 1) + "), no overwrites", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : random) hashMap.put(i, 0);
        time2 = System.nanoTime();
        for (int i : random) linkedHashMap.put(i, 0);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Put: random key (between 0 and " + (iterations - 1) + " inclusive), all overwrites (exactly one per entry, random order)", time1, time2, time3, prevGCRuns);


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : sequential) hashMap.put(i, 0);
        time2 = System.nanoTime();
        for (int i : sequential) linkedHashMap.put(i, 0);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Put: sequential key (from 0 to " + (iterations - 1) + "), all overwrites (exactly one per entry, sequential order)", time1, time2, time3, prevGCRuns);


        //Empty maps
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : random) hashMap.put(i, 0);
        time2 = System.nanoTime();
        for (int i : random) linkedHashMap.put(i, 0);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Put: random key (between 0 and " + (iterations - 1) + " inclusive), no overwrites", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : sequential) hashMap.get(i);
        time2 = System.nanoTime();
        for (int i : sequential) linkedHashMap.get(i);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Sequential get, randomized internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : random) hashMap.get(i);
        time2 = System.nanoTime();
        for (int i : random) linkedHashMap.get(i);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Random get, randomized internal keys", time1, time2, time3, prevGCRuns);


        //Set sequential keys
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();
        for (int i : sequential)
        {
            hashMap.put(i, 0);
            linkedHashMap.put(i, 0);
        }


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : sequential) hashMap.get(i);
        time2 = System.nanoTime();
        for (int i : sequential) linkedHashMap.get(i);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Sequential get, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : random) hashMap.get(i);
        time2 = System.nanoTime();
        for (int i : random) linkedHashMap.get(i);
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("Random get, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : hashMap.values()) ;
        time2 = System.nanoTime();
        for (int i : linkedHashMap.values()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("values() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Set random keys
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();
        for (int i : random)
        {
            hashMap.put(i, 0);
            linkedHashMap.put(i, 0);
        }


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : hashMap.values()) ;
        time2 = System.nanoTime();
        for (int i : linkedHashMap.values()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("values() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (int i : hashMap.keySet()) ;
        time2 = System.nanoTime();
        for (int i : linkedHashMap.keySet()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("keySet() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);


        //Set sequential keys
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();
        for (int i : sequential)
        {
            hashMap.put(i, 0);
            linkedHashMap.put(i, 0);
        }


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (int i : hashMap.keySet()) ;
        time2 = System.nanoTime();
        for (int i : linkedHashMap.keySet()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("keySet() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Test
        time1 = System.nanoTime();
        for (Map.Entry<Integer, Integer> entry : hashMap.entrySet()) ;
        time2 = System.nanoTime();
        for (Map.Entry<Integer, Integer> entry : linkedHashMap.entrySet()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("entrySet() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);


        //Set random keys
        hashMap = new HashMap<>();
        linkedHashMap = new LinkedHashMap<>();
        for (int i : random)
        {
            hashMap.put(i, 0);
            linkedHashMap.put(i, 0);
        }


        //Attempt GC
        System.gc();
        prevGCRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();


        //Test
        time1 = System.nanoTime();
        for (Map.Entry<Integer, Integer> entry : hashMap.entrySet()) ;
        time2 = System.nanoTime();
        for (Map.Entry<Integer, Integer> entry : linkedHashMap.entrySet()) ;
        time3 = System.nanoTime();

        prevGCRuns = printAndReset("entrySet() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);
    }


    protected static int printAndReset(String description, long time1, long time2, long time3, int prevGCRuns)
    {
        System.out.println(description);
        System.out.println("HashMap: " + (time2 - time1) + " nanos");
        System.out.println("LinkedHashMap: " + (time3 - time2) + " nanos");
        int gcRuns = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) gcRuns += gcBean.getCollectionCount();
        System.out.println("GC during test: " + (gcRuns != prevGCRuns));
        System.out.println();

        return gcRuns;
    }


    public static int random(int maxvalue)
    {
        return (int) ((double) maxvalue * Math.random());
    }
}

Output...

Put: sequential key (from 0 to 999999), no overwrites
HashMap: 30190070 nanos
LinkedHashMap: 44618672 nanos
GC during test: false

Put: random key (between 0 and 999999 inclusive), all overwrites (exactly one per entry, random order)
HashMap: 118536111 nanos
LinkedHashMap: 110828524 nanos
GC during test: false

Put: sequential key (from 0 to 999999), all overwrites (exactly one per entry, sequential order)
HashMap: 25070771 nanos
LinkedHashMap: 23569874 nanos
GC during test: false

Put: random key (between 0 and 999999 inclusive), no overwrites
HashMap: 93353708 nanos
LinkedHashMap: 106686445 nanos
GC during test: false

Sequential get, randomized internal keys
HashMap: 38817600 nanos
LinkedHashMap: 39499373 nanos
GC during test: false

Random get, randomized internal keys
HashMap: 42636179 nanos
LinkedHashMap: 51062653 nanos
GC during test: false

Sequential get, sequential internal keys
HashMap: 19986540 nanos
LinkedHashMap: 19502021 nanos
GC during test: false

Random get, sequential internal keys
HashMap: 58083205 nanos
LinkedHashMap: 59547574 nanos
GC during test: false

values() iteration, sequential internal keys
HashMap: 21284921 nanos
LinkedHashMap: 18383069 nanos
GC during test: false

values() iteration, randomized internal keys
HashMap: 19616868 nanos
LinkedHashMap: 15392964 nanos
GC during test: false

keySet() iteration, randomized internal keys
HashMap: 18054895 nanos
LinkedHashMap: 16067725 nanos
GC during test: false

keySet() iteration, sequential internal keys
HashMap: 18764430 nanos
LinkedHashMap: 18604873 nanos
GC during test: false

entrySet() iteration, sequential internal keys
HashMap: 18493825 nanos
LinkedHashMap: 18067752 nanos
GC during test: false

entrySet() iteration, randomized internal keys
HashMap: 16252707 nanos
LinkedHashMap: 13175517 nanos
GC during test: false
like image 32
Laike Endaril Avatar answered Sep 30 '22 22:09

Laike Endaril