Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

hashmap or hashset?

I have two list containing

List<MyObj>.   

and MyObj has a "String ID" member.

I need to iterate them from time to time and sometimes I need to find objects which are similar on both.
I want a quicker way than lists. so I can know I can use hashMap (than ask contains (String ) when comparing )

should I use hashmap or hashset?

note: in a hashset - I need to do implement my equals and when I run contains() - i think it will be slower than hashmap where upon inserting I put the string id in the key. Am I correct?

like image 706
Bick Avatar asked May 17 '11 08:05

Bick


2 Answers

note: in a hashset - I need to do implement my equals and when I run contains() - i think it will be slower than hashmap where upon inserting I put the string id in the key. Am I correct?

I don't think you would notice any performance difference. HashSet<E> is implemented using a HashMap<E, E> under the hood. So the only difference would be calling MyObj.equals() (which supposedly calls String.equals()) vs calling String.equals() directly. And the JIT compiler is pretty good at inlining calls...

The bottom line is, you should (almost) never worry about micro-optimizations, rather focus on making your design simple and consistent. If your only concern is to avoid duplication and to check for containment, a Set is a more logical choice.

like image 196
Péter Török Avatar answered Sep 26 '22 17:09

Péter Török


This does not really make a difference at all, because when you look at the JDK source code, the Sun implementation of HashSet uses an instance of HashMap internally to store its values:

   public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{
    static final long serialVersionUID = -5024744406713321676L;

    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
......

And even if that was not the case, all other answers about that it does not really make a difference from a performance POV apply. The only real difference is that instead of using the equals() and hashCode() implementations of your key class you need to write your own for using the Set - but those could be as simple as delegating to the id field of your class, in case that the id field is the unique identifier.

like image 45
BertNase Avatar answered Sep 26 '22 17:09

BertNase