Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

TreeSet vs HashSet speed for small set size, when O(1) vs O(log n) doesn't matter

I've been reading up on complexity of HashSet vs TreeSet, and everywhere I find topics explaining: "HashSet is faster because it's O(1) instead of O(log n)." Of course, I know this.

However, that only holds if you're working with very large sets. I, on the other hand, need to work with millions of "small" sets, each containing at most 200 objects, and most far less (below 20). The operations on them are very diverse (creating, adding, removing, membership testing, cloning, ...) and hence I'm puzzled on how to best measure/simulate the difference.

Which of both classes will have the least overhead for such small set sizes? Both in terms of speed and of memory overhead. And what about LinkedHashSet?

like image 656
user1111929 Avatar asked Dec 04 '25 04:12

user1111929


1 Answers

and hence I'm puzzled on how to best measure/simulate the difference.

Use a profiler. If the Set operations don't dominate the results (CPU time, memory footprint, allocation rates) then your choice will not make a difference in practice due to amdahl's law..

The biggest advantage of the TreeSet is ordering.

And neither implementation is particularly memory-efficient, there are better Sets out there, depending on which performance measure you care most about. They are wrappers around the corresponding Map implementations and the Maps themselves are not particularly efficient either.

They're more designed for flexibility, providing the vast set of APIs they do than optimizing any particular performance aspect.

like image 71
the8472 Avatar answered Dec 06 '25 21:12

the8472



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!