Fastest Java HashSet<Integer> library [closed]

Question

In addition to this quite old post, I need something that will use primitives and give a speedup for an application that contains lots of HashSets of Integers:

Set<Integer> set = new HashSet<Integer>();

People mention libraries like Guava, Javolution, and Trove, but I have found no thorough comparison of them in terms of benchmarks and performance results, or at least a good answer based on solid experience. From what I see, many recommend Trove's TIntHashSet, but others say it is not that good; some say Guava is supercool and manageable, but I do not need beauty and maintainability, only execution time, so Python-style Guava goes home :) Javolution? I've visited the website; it seems too old to me and thus wacky.

The library should give the best achievable run time; memory does not matter.

In "Thinking in Java" there is an idea of creating a custom HashMap with int[] as keys. I would like to see something similar for a HashSet, or simply download and use an amazing library.

EDIT (in response to the comments below): In my project I start with about 50 HashSet<Integer> collections, then I call a function about 1000 times, and each call creates up to 10 HashSet<Integer> collections internally. If I change the initial parameters, these numbers may grow exponentially. I only use the add(), contains() and clear() methods on those collections, which is why they were chosen.

Now I'm looking for a library that implements HashSet or something similar, but does it faster by avoiding the Integer autoboxing overhead, and maybe other overheads I do not know about. In fact, my data comes in as ints, and I store them in those HashSets.

Has QUIT--Anony-Mousse · Accepted Answer

Trove is an excellent choice.

The reason why it is much faster than generic collections is memory use.

A java.util.HashSet<Integer> uses a java.util.HashMap<Integer, Object> internally, storing each element as a key mapped to a shared dummy value. In a HashMap, each key is held in an Entry object. Each element therefore takes an estimated 24 bytes for the Entry + 16 bytes for the boxed Integer + 4 bytes for the reference in the actual hash table. This yields 44 bytes, as opposed to 4 bytes in Trove: up to 11x the memory (note that unoccupied entries in the main table will make the difference smaller in practice).
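To make the memory argument concrete, here is a minimal sketch of a primitive int set that keeps its elements directly in an int[] using open addressing with linear probing. This is not Trove's implementation; the class name IntOpenHashSet, the 0.5 load factor, and the hash mixing constant are assumptions chosen for illustration. It supports only add(), contains() and clear(), which are the operations the question uses. Note that the boolean[] occupancy array adds one byte per slot on top of the 4-byte int.

```java
import java.util.Arrays;

/** Hypothetical minimal int set (illustration only, not Trove's code). */
final class IntOpenHashSet {
    private int[] slots;     // elements stored inline, 4 bytes per slot, no boxing
    private boolean[] used;  // occupancy flags, so any int value (including 0) can be stored
    private int size;

    IntOpenHashSet(int expected) {
        int cap = 16;
        // Power-of-two capacity, at least twice the expected size (load factor <= 0.5).
        while (cap < expected * 2L && cap < (1 << 30)) cap <<= 1;
        slots = new int[cap];
        used = new boolean[cap];
    }

    // Spread the bits of x, then reduce to a table index.
    private static int index(int x, int mask) {
        int h = x * 0x9E3779B9;
        return (h ^ (h >>> 16)) & mask;
    }

    /** Returns true if x was not already present. */
    boolean add(int x) {
        if ((size + 1) * 2L > slots.length) resize();
        int mask = slots.length - 1;
        int i = index(x, mask);
        while (used[i]) {                 // linear probing
            if (slots[i] == x) return false;
            i = (i + 1) & mask;
        }
        used[i] = true;
        slots[i] = x;
        size++;
        return true;
    }

    boolean contains(int x) {
        int mask = slots.length - 1;
        int i = index(x, mask);
        while (used[i]) {
            if (slots[i] == x) return true;
            i = (i + 1) & mask;
        }
        return false;
    }

    /** Empties the set but keeps the arrays, so the set can be reused without reallocation. */
    void clear() {
        Arrays.fill(used, false);
        size = 0;
    }

    int size() {
        return size;
    }

    // Doubles the table and reinserts every stored element.
    private void resize() {
        int[] oldSlots = slots;
        boolean[] oldUsed = used;
        slots = new int[oldSlots.length * 2];
        used = new boolean[slots.length];
        size = 0;
        for (int i = 0; i < oldSlots.length; i++) {
            if (oldUsed[i]) add(oldSlots[i]);
        }
    }
}
```

Because there is no remove(), the sketch needs no tombstones, and calling clear() on a long-lived instance avoids allocating a fresh set on every function call. Whether this beats a tuned library in a given application is something only a benchmark on the real workload can tell.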

Tags:

java

performance

hashset

Sophie Sperner

2 Answers

Has QUIT--Anony-Mousse

cruftex
