
How to find duplicates in a list?

I have a list of unsorted integers and I want to find those elements which have duplicates.

val dup = List(1,1,1,2,3,4,5,5,6,100,101,101,102) 

I can find the distinct elements of the list with dup.distinct, so I wrote my answer as follows.

val dup = List(1,1,1,2,3,4,5,5,6,100,101,101,102)
val distinct = dup.distinct
val elementsWithCounts = distinct.map( (a:Int) => (a, dup.count( (b:Int) => a == b )) )
val duplicatesRemoved = elementsWithCounts.filter( (pair: (Int,Int)) => { pair._2 <= 1 } )
val withDuplicates = elementsWithCounts.filter( (pair: (Int,Int)) => { pair._2 > 1 } )

Is there an easier way to solve this?

Phil asked Jul 14 '14 04:07




2 Answers

Try this:

val dup = List(1,1,1,2,3,4,5,5,6,100,101,101,102)
dup.groupBy(identity).collect { case (x, List(_,_,_*)) => x }

The groupBy associates each distinct integer with a list of its occurrences. The collect is basically map where non-matching elements are ignored. The match pattern following case will match integers x that are associated with a list that fits the pattern List(_,_,_*), a list with at least two elements, each represented by an underscore since we don't actually need to store those values (and those two elements can be followed by zero or more elements: _*).
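To make the intermediate steps concrete, here is a minimal sketch that splits the one-liner apart. The object name GroupByDemo and the val names are made up for illustration, and the final sort is an addition, since a Map does not guarantee iteration order.

```scala
object GroupByDemo {
  val dup: List[Int] = List(1, 1, 1, 2, 3, 4, 5, 5, 6, 100, 101, 101, 102)

  // Each distinct value maps to the list of its occurrences,
  // e.g. 1 -> List(1, 1, 1) and 2 -> List(2).
  val grouped: Map[Int, List[Int]] = dup.groupBy(identity)

  // Keep only the keys whose occurrence list has at least two elements.
  // Sorting is an extra step here to get a stable order out of the Map.
  val duplicates: List[Int] =
    grouped.collect { case (x, List(_, _, _*)) => x }.toList.sorted

  def main(args: Array[String]): Unit = {
    println(grouped(1))
    println(duplicates)
  }
}
```

If the order of the result does not matter, the .toList.sorted step can be dropped.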

You could also do:

dup.groupBy(identity).collect { case (x,ys) if ys.lengthCompare(1) > 0 => x } 

Either version is much faster than the approach you provided: groupBy makes a single pass over the list, whereas calling dup.count once per distinct element rescans the whole list each time.
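If you also want the counts from the question (elementsWithCounts, withDuplicates, duplicatesRemoved), one possible single-pass variant is sketched below. It swaps groupBy for a foldLeft that keeps a running count per value; the object name CountOnce and the val counts are hypothetical, and the sort is only there to give a predictable order.

```scala
object CountOnce {
  val dup: List[Int] = List(1, 1, 1, 2, 3, 4, 5, 5, 6, 100, 101, 101, 102)

  // One traversal: bump the count for each element as it is seen.
  val counts: Map[Int, Int] =
    dup.foldLeft(Map.empty[Int, Int]) { (acc, x) =>
      acc.updated(x, acc.getOrElse(x, 0) + 1)
    }

  // Split the (value, count) pairs the same way the question does:
  // counts above 1 on the left, everything else on the right.
  val (withDuplicates, duplicatesRemoved) =
    counts.toList.sortBy(_._1).partition(_._2 > 1)

  def main(args: Array[String]): Unit = {
    println(withDuplicates)
    println(duplicatesRemoved)
  }
}
```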

dhg answered Sep 27 '22 23:09


A bit late to the party, but here's another approach:

dup.diff(dup.distinct).distinct 

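diff is a multiset difference: it removes one occurrence from dup for each element of the argument (dup.distinct), so only the extra occurrences remain, and those are the duplicates. The final distinct collapses repeated extras into a single entry each.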
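Broken into two steps, the expression looks roughly like this. The names DiffDemo, extras and duplicates are only for illustration.

```scala
object DiffDemo {
  val dup: List[Int] = List(1, 1, 1, 2, 3, 4, 5, 5, 6, 100, 101, 101, 102)

  // Remove one occurrence of every distinct value; what is left
  // are the surplus copies, in their original order.
  val extras: List[Int] = dup.diff(dup.distinct) // List(1, 1, 5, 101)

  // A value that occurred three or more times still appears more
  // than once in extras, so deduplicate once more.
  val duplicates: List[Int] = extras.distinct // List(1, 5, 101)

  def main(args: Array[String]): Unit = {
    println(extras)
    println(duplicates)
  }
}
```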

Luigi Plinge answered Sep 28 '22 01:09