Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I capture only the duplicate elements in an ArrayList?

I need to get all duplicates in my ArrayList. I don't need to remove duplicates, I need to add them to another ArrayList. Here is an example:

ArrayList<String> var = new ArrayList<>();
var.add("a");
var.add("b");
var.add("b");
var.add("c");

So, as you can see, there are 2 duplicate elements (b, and b). I need to add them to another ArrayList.

The resulting ArrayList in this case should be [b,b]. How can I do this?

like image 383
FruthyzGang Avatar asked Dec 08 '22 09:12

FruthyzGang


2 Answers

Approach 1:

Something like this is enough:

    for(String s : var )
        if(Collections.frequency(var, s) > 1)
            duplicates.add(s);

and with streams:

var.stream().filter(s -> frequency(var, s) > 1).collect(toList());

a running example:

public static void main(String[] args) {

    List<String> var = Arrays.asList("a", "b", "b", "c");
    List<String> dup = var.stream().filter(s -> Collections.frequency(var, s) > 1).collect(Collectors.toList());         
    System.out.println(dup);
}

Output:

[b, b]

The idea is as follows, go thought the list, and for each element check their frequency on the list, if they appear more than once add to list of duplicates.

Approach 2:

A less cleaner solution, but with a better time complexity is to use a Map of the string per frequency of that string, and then build the duplicate list base on that Map:

List<String> dup =  new ArrayList<>();
Map<String, Long> frequencies =
        var.stream()
           .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

     for (Map.Entry<String, Long> entry : frequencies.entrySet()){
         for(int i = 0; i < entry.getValue() && entry.getValue() > 1; i++)
             dup.add(entry.getKey());
}

Approach 3:

With linear time complexity O(N):

    Set<String> set = new HashSet <>();
    List<String> duplicates = new ArrayList<>();
    Set<String> is_duplicated = new HashSet <>();
    var.forEach(s -> {
        if(set.contains(s)) {
            is_duplicated.add(s);
            duplicates.add(s);
        }
        else
          set.add(s);
    });
    duplicates.addAll(is_duplicated);
    System.out.println(duplicates);

One can take advantage of the Set.add method semantics, namely:

If this set already contains the element, the call leaves the set
     unchanged and returns {@code false}.

To shorten the above code to only:

    Set<String> set = new HashSet <>();
    List<String> duplicates = new ArrayList<>();
    Set<String> to_add = new HashSet<>();
    var.forEach(s -> {
        if(!set.add(s)) {
            to_add.add(s);
            duplicates.add(s);
        }
    });
    duplicates.addAll(to_add);
    System.out.println(duplicates);  
like image 102
dreamcrash Avatar answered Feb 01 '23 23:02

dreamcrash


You can use a HashMap to store the number of occurrences of the String at each iteration.

Then, for the elements occuring more than once, add them n times to the new List:

List<String> var = new ArrayList<>();
var.add("a");
var.add("b");
var.add("b");
var.add("c");
        
Map<String, Integer> map = new HashMap<>();
for(String str : var) {
    if(map.containsKey(str))
        map.put(str, map.get(str)+1);
    else
        map.put(str, 1);
}
        
List<String> duplicates = new ArrayList<>();
for (String str : var) {
    int count = map.get(str);
    if(count > 1) {
        duplicates.add(str);
    }
}
        
System.out.println(duplicates);

Output:

[b,b]
like image 23
Majed Badawi Avatar answered Feb 01 '23 23:02

Majed Badawi