Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HashSet vs LinkedHashSet

The answer lies in which constructors the LinkedHashSet uses to construct the base class:

public LinkedHashSet(int initialCapacity, float loadFactor) {
    super(initialCapacity, loadFactor, true);      // <-- boolean dummy argument
}

...

public LinkedHashSet(int initialCapacity) {
    super(initialCapacity, .75f, true);            // <-- boolean dummy argument
}

...

public LinkedHashSet() {
    super(16, .75f, true);                         // <-- boolean dummy argument
}

...

public LinkedHashSet(Collection<? extends E> c) {
    super(Math.max(2*c.size(), 11), .75f, true);   // <-- boolean dummy argument
    addAll(c);
}

And (one example of) a HashSet constructor that takes a boolean argument is described, and looks like this:

/**
 * Constructs a new, empty linked hash set.  (This package private
 * constructor is only used by LinkedHashSet.) The backing
 * HashMap instance is a LinkedHashMap with the specified initial
 * capacity and the specified load factor.
 *
 * @param      initialCapacity   the initial capacity of the hash map
 * @param      loadFactor        the load factor of the hash map
 * @param      dummy             ignored (distinguishes this
 *             constructor from other int, float constructor.)
 * @throws     IllegalArgumentException if the initial capacity is less
 *             than zero, or if the load factor is nonpositive
 */
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<E,Object>(initialCapacity, loadFactor);
}

HashSet is unordered and unsorted Set.
LinkedHashSet is the ordered version of HashSet.

The only difference between HashSet and LinkedHashSet is that:
LinkedHashSet maintains the insertion order.

When we iterate through a HashSet, the order is unpredictable while it is predictable in case of LinkedHashSet.

The reason for how LinkedHashSet maintains insertion order is that:
The underlying used data structure is Doubly-Linked-List.


LinkedHashSet's constructors invoke the following base class constructor:

HashSet(int initialCapacity, float loadFactor, boolean dummy) {
  map = new LinkedHashMap<E, Object>(initialCapacity, loadFactor);
}

As you can see, the internal map is a LinkedHashMap. If you look inside LinkedHashMap, you'll discover the following field:

private transient Entry<K, V> header;

This is the linked list in question.


I suggest you to use LinkedHashSet most of the time, because it has better performance overall):

  1. Predictable iteration order LinkedHashSet (Oracle)
  2. LinkedHashSet is more expensive for insertions than HashSet;
  3. In general slightly better performance than HashMap, because the most of the time we use Set structures for iterating.

Performance tests:

------------- TreeSet -------------
 size       add  contains   iterate
   10       746       173        89
  100       501       264        68
 1000       714       410        69
10000      1975       552        69
------------- HashSet -------------
 size       add  contains   iterate
   10       308        91        94
  100       178        75        73
 1000       216       110        72
10000       711       215       100
---------- LinkedHashSet ----------
 size       add  contains   iterate
   10       350        65        83
  100       270        74        55
 1000       303       111        54
10000      1615       256        58

You can see source test page here: The Final Performance Testing Example