Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Adding arrays with same values to HashSet results in duplicate items

I'm trying to create a set of arrays of ints, the thing is that if I try to do:

HashSet<int[]> s = new HashSet<int[]>();
int a1[] = {1,2,3};
int a2[] = {1,2,3};
s.add(a1);
s.add(a2)
System.out.println(s.size());

Then s has two objects, but there should be only one. Note: it doesn't matter if it is HashSet< Integer[]>. It just doesn't work.

Now If I try to do this with an ArrayList< Integer>, something like:

HashSet<ArrayList<Integer>> s = new HashSet<ArrayList<Integer>>();
ArrayList<Integer> a1 = new ArrayList<Integer>();
ArrayList<Integer> a2 = new ArrayList<Integer>();
a1.add(1);
a1.add(2);
a1.add(3);

a2.add(1);
a2.add(2);
a2.add(3);

s.add(a1);
s.add(a2)
System.out.println(s.size());

Then s has one object.

I though a way to avoid the error in the first code and was storing hashcodes of each array in a hashset as follows:

int a1[] = {0,10083,10084,1,0,1,10083,0,0,0,0};
int a2[] = {1 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,1 ,0,2112};
HashSet<Integer> s= new HashSet<Integer>();//hashcodes of each array
s.add(Arrays.hashCode(a1));
s.add(Arrays.hashCode(a2));
System.out.println(Arrays.hashCode(a1));
System.out.println(Arrays.hashCode(a2));
System.out.println(s.size());

It works for the first case(1,2,3) but in cases where there are collisions it doesn't work, so I would have to manage collisions. So, I think that what I am doing is implementing a HashSet by myself.

With HashSet< ArrayList< Integer>> it works perfectly. I suppose that java manage collisions in that case.

My question is why java does not allow to manage a HashSet< int[]> or HashSet< Integer[]> if hashcodes generated are the same as in ArrayList< Integer> and hashcodes of arrays can be calculated simply by calling Arrays.hashCode(...).

And finally, if I want to do a HashSet< int[]>(or HashSet< Integer[]>) I would have to implement it by myself? Or there is a better way to do it?

Thanks.

UPDATE: Ok, finally I think I have came to a complete answer. As @ZiyaoWei and @user1676075 commented it doesn't work because equals returns false and hashcodes are differents. But, why java does not override this methods(with Arrays.equals(), Arrays.hashCode()) so one can do something like HashSet< int[]>? The answer is because an array is a mutable object, and hashcode can not depend on mutable values(each element of array is a mutable value) according to the general contract of hashcode. Mutable objects and hashCode

Here nice explanations of using mutable fields in hashCode http://blog.mgm-tp.com/2012/03/hashset-java-puzzler/ and mutable keys in hashmaps Are mutable hashmap keys a dangerous practice?

My answer is, if you want to use a HashSet< int[]> you have to create a class that has an array and if you want that hashcode and equals to depend on values, override methods equals() and hashCode() with Arrays.equals() and Arrays.hashCode(). If you don't want to violate the contract just make the array final.

Thanks everyone!

like image 917
voodoo14 Avatar asked May 20 '13 20:05

voodoo14


1 Answers

It has nothing to do with collision at the end of the day:

a1.equals(a2) == false

Since they are not equal, a Set will treat them as different.

Note Array in Java does not override the equals method from Object.

And since add in Set is defined as

More formally, adds the specified element e to this set if the set contains no element e2 such that (e==null ? e2==null : e.equals(e2))

is seems to be impossible to properly implement a Set that might meet your requirement (compare elements with Arrays.equals) without violating some contracts.

like image 199
zw324 Avatar answered Sep 20 '22 12:09

zw324