
Remove duplicate element from the set in java [duplicate]

Tags: java, set

I have a set of String arrays, and I want to remove the duplicate elements from it:

    String[] arr1 = {"a1","b1"};
    String[] arr2 = {"a2","b2"};
    Set<String[]> mySet = new HashSet<String[]>();
    mySet.add(arr1);
    mySet.add(arr2);
    mySet.add(new String[] {"a1","b1"});
    System.out.print(mySet.size());

Currently mySet looks like this:

[{"a1","b1"},{"a2","b2"},{"a1","b1"}]

But I want it to look like this:

[{"a1","b1"},{"a2","b2"}]

I know of some options:

  1. Run an inner loop on every insert and check whether the new array is a duplicate.
  2. Can I override the set's behavior (hashCode or equals)? I do not know how.
  3. Do I need to change the data structure (LinkedHashSet, List, or something else more suitable)?
Manan Shah asked Dec 17 '13 05:12




3 Answers

Arrays inherit from Object and don't override the hashCode and equals methods, so two distinct arrays are never equal to each other, even when their contents match. A HashSet is backed by a Map implementation, which in turn uses hashCode and equals to avoid duplicate elements.
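To make the difference concrete, here is a minimal sketch contrasting the inherited, reference-based methods with the content-based helpers in java.util.Arrays. The class name ArrayIdentityDemo and the helper sizeAfterAddingEqualArrays are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ArrayIdentityDemo {

    // Adds two distinct array objects with the same contents and reports the set size.
    static int sizeAfterAddingEqualArrays() {
        Set<String[]> set = new HashSet<>();
        set.add(new String[] {"a1", "b1"});
        set.add(new String[] {"a1", "b1"});
        return set.size();
    }

    public static void main(String[] args) {
        String[] x = {"a1", "b1"};
        String[] y = {"a1", "b1"};

        System.out.println(x.equals(y));                              // false: reference comparison
        System.out.println(Arrays.equals(x, y));                      // true: element-wise comparison
        System.out.println(Arrays.hashCode(x) == Arrays.hashCode(y)); // true: content-based hash
        System.out.println(sizeAfterAddingEqualArrays());             // 2: HashSet keeps both arrays
    }
}
```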

You can use a TreeSet with a custom Comparator that compares the String arrays for equality.

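Set<String[]> mySet = new TreeSet<>(new Comparator<String[]>() {

  @Override
  public int compare(String[] o1, String[] o2) {
    // Lexicographic, element-by-element comparison; 0 only when the contents are equal.
    // Arrays.compare needs Java 9+. Subtracting hash codes is unsafe here: the
    // difference can overflow, and colliding hashes give an inconsistent ordering.
    return Arrays.compare(o1, o2);
  }

});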

Note that this only discards duplicate arrays whose elements match position by position. Arrays that hold the same elements in a different order are not considered duplicates.

If you also want to discard unordered duplicates, e.g. {a1, b1} and {b1, a1}, use this:

@Override
public int compare(String[] o1, String[] o2) {
    // Compare sorted copies so element order is ignored and the inputs stay unmodified.
    String[] s1 = o1.clone();
    String[] s2 = o2.clone();
    Arrays.sort(s1);
    Arrays.sort(s2);
    // Java 9+; 0 only when both arrays hold the same elements (with the same counts).
    return Arrays.compare(s1, s2);
}
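The question also asks whether equals and hashCode can be overridden. Arrays cannot be subclassed, but a small wrapper class can supply content-based versions of both methods so that a plain HashSet works. A minimal sketch, assuming a hypothetical wrapper named StringArrayKey (not a JDK class):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class WrapperDemo {
    public static void main(String[] args) {
        Set<StringArrayKey> mySet = new HashSet<>();
        mySet.add(new StringArrayKey("a1", "b1"));
        mySet.add(new StringArrayKey("a2", "b2"));
        mySet.add(new StringArrayKey("a1", "b1")); // equal contents, so not added again
        System.out.println(mySet.size());          // prints 2
    }
}

// Hypothetical wrapper (not part of the JDK) giving a String[] content-based equality.
final class StringArrayKey {
    private final String[] values;

    StringArrayKey(String... values) {
        // Defensive copy: later changes to the caller's array cannot alter the key.
        this.values = values.clone();
    }

    String[] toArray() {
        return values.clone();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof StringArrayKey
                && Arrays.equals(values, ((StringArrayKey) other).values);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(values);
    }
}
```

The defensive copy matters because a key whose hash changes after insertion can no longer be found in the set.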
c.P.u1 answered Oct 22 '22 12:10


An array's hashCode is independent of the array's contents: arrays inherit Object's hashCode, which is based on the object's identity rather than its elements.

However, a List would do what you want. Its hashCode (like its equals) is based on the elements in the List. From the Javadoc:

int hashCode = 1;
for (E e : list)
    hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());
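As a quick check of that formula, the sketch below applies it by hand and compares the result with what List.hashCode() returns. The class ListHashCodeDemo and the helper manualHash are illustrative names only.

```java
import java.util.Arrays;
import java.util.List;

class ListHashCodeDemo {

    // Applies the formula documented for List.hashCode() by hand.
    static int manualHash(List<String> list) {
        int hashCode = 1;
        for (String e : list) {
            hashCode = 31 * hashCode + (e == null ? 0 : e.hashCode());
        }
        return hashCode;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a1", "b1");
        List<String> b = Arrays.asList("a1", "b1");

        System.out.println(manualHash(a) == a.hashCode()); // true
        System.out.println(a.hashCode() == b.hashCode());  // true: equal contents, equal hash
        System.out.println(a.equals(b));                   // true: element-wise equality
    }
}
```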

Example:

List<String> list1 = Arrays.asList("a1","b1");
List<String> list2 = Arrays.asList("a2","b2");
Set<List<String>> mySet = new HashSet<List<String>>();
mySet.add(list1);
mySet.add(list2);
mySet.add(Arrays.asList("a1","b1"));   // duplicate won't be added
System.out.print(mySet.size());        // size = 2
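If the data already exists as String[] values, the same idea can be applied by wrapping each array in a List on the way in. The sketch below is one way to do that, using a LinkedHashSet (one of the structures the question mentions) to keep first-seen order; DedupArraysDemo and dedup are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DedupArraysDemo {

    // Returns the input arrays without content duplicates, keeping first-seen order.
    static List<String[]> dedup(List<String[]> arrays) {
        Set<List<String>> seen = new LinkedHashSet<>();
        for (String[] array : arrays) {
            // The List view supplies content-based equals and hashCode.
            seen.add(Arrays.asList(array));
        }
        List<String[]> result = new ArrayList<>();
        for (List<String> list : seen) {
            result.add(list.toArray(new String[0]));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> input = Arrays.asList(
                new String[] {"a1", "b1"},
                new String[] {"a2", "b2"},
                new String[] {"a1", "b1"});
        for (String[] array : dedup(input)) {
            System.out.println(Arrays.toString(array));
        }
        // prints [a1, b1] then [a2, b2]
    }
}
```

Note that Arrays.asList returns a view backed by the original array, so the arrays should not be modified while their views sit in the set.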
bcorso answered Oct 22 '22 10:10


Arrays use the identity-based Object.hashCode() implementation, and their equals() compares references rather than contents, so a HashSet cannot detect arrays with equal contents on its own. If you still want to store arrays in the set, I would suggest using a TreeSet with a Comparator.

This is not a fail-proof approach (it does not handle null elements, for instance), but you should be able to build a fine-tuned solution from my example:

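import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

public class Main {
    public static void main(String[] args) {
        String[] arr1 = {"a1","b1"};
        String[] arr2 = {"a2","b2"};
        Set<String[]> mySet = new TreeSet<String[]>(new ArrayComparator());
        mySet.add(arr1);
        mySet.add(arr2);
        mySet.add(new String[] {"a1","b1"});   // same contents as arr1, so it is rejected
        System.out.println(mySet.size());      // prints 2
        for (String[] aa : mySet) {
            System.out.println(aa[0] + " , " + aa[1]);
        }
    }
}

class ArrayComparator implements Comparator<String[]> {

    @Override
    public int compare(String[] ar1, String[] ar2) {
        // Shorter arrays sort first. Returning -1 in both directions would break
        // the Comparator contract and leave the TreeSet ordering inconsistent.
        if (ar1.length != ar2.length) {
            return Integer.compare(ar1.length, ar2.length);
        }
        // Same length: order by the first pair of elements that differ.
        for (int count = 0; count < ar1.length; count++) {
            int cmp = ar1[count].compareTo(ar2[count]);
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0; // identical contents: the TreeSet treats the arrays as duplicates
    }
}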
Satheesh Cheveri answered Oct 22 '22 12:10