Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does the order of elements in a HashSet work?

Tags:

java

hashset

I understand that the order of elements in a HashSet is supposed to be arbitrary. But out of curiosity, could anyone tell me exactly how the order is determined?

I noticed that when I insert two elements (say A and B), the order would come out A, B, then re-executing the same code again would give me B, A, then re-excecuting it the third time would give me A, B.

I mean, that's kind of un-deterministic, and a bit weird.

like image 983
One Two Three Avatar asked Feb 15 '23 01:02

One Two Three


2 Answers

The order is determined by the Hashing algorithm used within the Hash Map/Set, the exact settings of that Map and the Hashcodes of the objects.

If your objects have consistent hashcodes over multiple runs (Strings for example) and are placed in the same order into a map with the same settings then in general they would come out in the same order each time. If they don't then they won't.

The source for HashMap can be seen here: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java

In fact an interesting quote from that source is:

This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

So not only may the order be different each time your program runs, but in fact the API itself makes no guarantee that the order will remain constant even inside one run of the program!

"Un-deterministic and a bit weird" is a good description of ordering of a HashMap - and is actually pretty much what the docs say. If you want ordering use either LinkedHashMap or TreeMap. If you don't want ordering then don't worry about it, by having the ordering being effectively random HashMap is giving you extremely fast responses from the methods who's behavior it does guarantee!

like image 179
Tim B Avatar answered Feb 16 '23 13:02

Tim B


In principle there are two contributing factors:

  1. Hashcode of your keys might be non-deterministic, this will be the case when you use default hashCode implementation, which relies on memory location

  2. HashSet itself can be non deterministic, take a look at HashMap.initHashSeedAsNeeded (HashSet uses HashMap in standard Oracle SDK as underlying datastructure), depending on some factors it can use sun.misc.Hashing.randomHashSeed(this) to initialize hashSeed field which is then used when computing hashCode of a key

Randomization can be important to achieve probabilistic performance guaranties. This is what javadoc says for hashSeed:

/** * A randomizing value associated with this instance that is applied to
* hash code of keys to make hash collisions harder to find. If 0 then
* alternative hashing is disabled. */

like image 33
hgrey Avatar answered Feb 16 '23 13:02

hgrey