I've always loved trees, that nice O(n*log(n)) and the tidiness of them. However, every software engineer I've ever known has asked me pointedly why I would use a TreeSet. From a CS background, I don't think it matters all that much which you use, and I don't care to mess around with hash functions and buckets (in the case of Java).
In which cases should I use a HashSet over a TreeSet?
Simply put, HashSet is faster than TreeSet. HashSet provides constant-time performance for basic operations like add(), remove() and contains() (assuming the hash function disperses the elements well among the buckets), versus the O(log n) time offered by TreeSet.
HashSet and TreeSet both belong to the Java Collections Framework. HashSet implements the Set interface, whereas TreeSet implements SortedSet. TreeSet is backed by a TreeMap, while HashSet is backed by a HashMap.
Ordering: TreeSet maintains its elements in sorted order, defined either by their natural ordering (Comparable) or by a Comparator supplied when the set is created. Elements are sorted in ascending order by default. It offers several methods for working with the ordered set, such as first(), last(), headSet(), and tailSet().
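To make the ordered-set methods concrete, here is a minimal sketch. The class name SortedSetDemo and the sample numbers are made up for illustration; only the java.util calls come from the standard library.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedSetDemo {
    // Builds a TreeSet from unsorted sample input (hypothetical values).
    // No Comparator is given, so the natural ascending order of Integer is used.
    static SortedSet<Integer> build() {
        return new TreeSet<>(Arrays.asList(42, 7, 19, 3, 25));
    }

    public static void main(String[] args) {
        SortedSet<Integer> s = build();
        System.out.println(s);              // [3, 7, 19, 25, 42]
        System.out.println(s.first());      // 3
        System.out.println(s.last());       // 42
        System.out.println(s.headSet(19));  // [3, 7]       (strictly less than 19)
        System.out.println(s.tailSet(19));  // [19, 25, 42] (greater than or equal to 19)
    }
}
```

Note that headSet() excludes its argument while tailSet() includes it, which is easy to get wrong when splitting a range.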
According to one post, the more items there are in a List, the slower it gets, while a HashSet stays roughly the same speed regardless of size. However, for very small collections a List is actually faster, and since I will only ever need 2 elements, a List would be the more performant choice.
HashSet is much faster than TreeSet (constant-time versus log-time for most operations like add, remove and contains) but offers no ordering guarantees like TreeSet.
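A small sketch of that trade-off, assuming you code against the Set interface so the implementation can be swapped later. The class name SetChoiceDemo, the helper fill(), and the sample words are hypothetical. It does not measure speed; it only shows that membership behaves the same while iteration order differs.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class SetChoiceDemo {
    // Adds the same sample words to whichever Set implementation is passed in.
    static Set<String> fill(Set<String> target) {
        for (String w : new String[] {"pear", "apple", "fig", "apple"}) {
            target.add(w);  // the second "apple" is a duplicate and is ignored
        }
        return target;
    }

    public static void main(String[] args) {
        Set<String> hashed = fill(new HashSet<>());
        Set<String> sorted = fill(new TreeSet<>());

        System.out.println(hashed.contains("fig"));  // true
        System.out.println(sorted.contains("fig"));  // true
        System.out.println(hashed.equals(sorted));   // true: same elements
        System.out.println(sorted);                  // [apple, fig, pear]
        System.out.println(hashed);                  // iteration order is unspecified
    }
}
```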
TreeSet implements SortedSet and offers a few handy methods for the ordered set, like first(), last(), headSet(), and tailSet().
LinkedHashSet is in some sense intermediate between HashSet and TreeSet. Implemented as a hash table with a linked list running through it, it provides insertion-ordered iteration, which is not the same as the sorted traversal guaranteed by TreeSet. So the choice depends entirely on your needs, but I feel that even if you need an ordered collection, you should still prefer a HashSet to create the Set and then convert it into a TreeSet:
SortedSet<String> s = new TreeSet<String>(hashSet);
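Here is a minimal sketch of both ideas, assuming the "hash table with a linked list running through it" refers to java.util.LinkedHashSet. The class name OrderingDemo, the INPUT list, and the helper methods are hypothetical. Whether building in a HashSet and converting is actually faster depends on the workload and is not measured here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class OrderingDemo {
    // Hypothetical sample data, deliberately not in alphabetical order.
    static final List<String> INPUT = Arrays.asList("pear", "apple", "fig");

    // LinkedHashSet iterates in the order the elements were inserted.
    static Set<String> insertionOrdered() {
        return new LinkedHashSet<>(INPUT);
    }

    // Collect into a HashSet first, then copy into a TreeSet once
    // when a sorted view is needed.
    static SortedSet<String> sortedCopy() {
        Set<String> hashSet = new HashSet<>(INPUT);
        return new TreeSet<>(hashSet);
    }

    public static void main(String[] args) {
        System.out.println(insertionOrdered());  // [pear, apple, fig]
        System.out.println(sortedCopy());        // [apple, fig, pear]
    }
}
```

The TreeSet copy is a snapshot: elements added to the HashSet afterwards do not appear in it.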
One advantage not yet mentioned of a TreeSet is that it has greater "locality", which is shorthand for saying (1) if two entries are nearby in the order, a TreeSet places them near each other in the data structure, and hence in memory; and (2) this placement takes advantage of the principle of locality, which says that similar data is often accessed by an application with similar frequency.
This is in contrast to a HashSet, which spreads the entries all over memory, no matter what their keys are.
When the latency cost of reading from a hard drive is thousands of times the cost of reading from cache or RAM, and when the data really is accessed with locality, the TreeSet can be a much better choice.