Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should you check for a duplicate before inserting into a set

I am learning to use sets. My question is : Sets do not contain duplicates. When we try to insert duplicates, it does not throw any error and automatically removes duplicates. Is it a good practice to check each value before inserting into set whether it exists or not? Or is it OK to do something like the below code? I think Java would be internally doing the check using .contains(value) . What do you think?

What would be the Big O complexity in both the cases considering there are n elements going into the set?

import java.util.HashSet;
import java.util.Set;

public class DuplicateTest {

    public static void main(String[] args) {
        // TODO Auto-generated method stub
         Set<Integer> mySet = new HashSet<Integer>();

         mySet.add(10);
         mySet.add(20);
         mySet.add(30);
         mySet.add(40);
         mySet.add(50);
         mySet.add(50);
         mySet.add(50);
         mySet.add(50);
         mySet.add(50);
         mySet.add(50);

         System.out.println("Contents of the Hash Set :"+mySet);
    }

}
like image 548
Zack Avatar asked Jan 10 '16 23:01

Zack


People also ask

Can we insert duplicate values in Set?

A Set is a Collection that cannot contain duplicate elements. It models the mathematical set abstraction. The Set interface contains only methods inherited from Collection and adds the restriction that duplicate elements are prohibited.

What happens if we insert duplicate elements into the Set?

If we insert duplicate values to the Set, we don't get any compile time or run time errors. It doesn't add duplicate values in the set. Below is the add() method of the set interface in java collection that returns Boolean value either TRUE or FALSE when the object is already present in the set.

How duplicates are avoided in Set?

for the first time add SOP will return true. then for the second time of added "123", SOP will return false. With this we can understand that, if we add same value in the Set for the second time or more, then the duplicated value will be skipped.

How does a Set know if there's a duplicate?

One more way to detect duplication in the java array is adding every element of the array into HashSet which is a Set implementation. Since the add(Object obj) method of Set returns false if Set already contains an element to be added, it can be used to find out if the array contains duplicates in Java or not.


3 Answers

As per the docs:

public boolean add(E e)

Adds the specified element to this set if it is not already present. More formally, adds the specified element e to this set if this set contains no element e2 such that (e==null ? e2==null : e.equals(e2)). If this set already contains the element, the call leaves the set unchanged and returns false.

So the add() method already returns you a true or a false. So you don't need to do the additional check.

like image 193
Atri Avatar answered Sep 25 '22 02:09

Atri


Compare with the API documentation of Set.add(E)

The add method checks if the element is already in the Set. If the element is already present, then the new element is not added, and the Set remains unchanged. In most situations, you don't need to check anything.

The complexity of the method depends of the concrete implementation of Set that you are using.

like image 35
Alejandro Goñi Avatar answered Sep 24 '22 02:09

Alejandro Goñi


Its ok not to check. This is the main advantage over Sets of Lists, as they will automatically filter out duplicates.

HashSet has constant time performance (http://docs.oracle.com/javase/8/docs/api/java/util/HashSet.html)

This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets

like image 36
DMozzy Avatar answered Sep 26 '22 02:09

DMozzy