How to remove 2 or more duplicates from list and maintain their initial order?

Tags:

scala

Let's assume we have a Scala list:

val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)

We can easily remove duplicates using the following code:

l1.distinct

or

l1.toSet.toList

But what if we want to remove duplicates only when there are more than 2 of them? That is, if more than 2 elements have the same value, we keep only the first two and remove the rest.

I could achieve it with the following code:

l1.groupBy(identity).mapValues(_.take(2)).values.toList.flatten

that gave me the result:

List(2, 2, 5, 1, 1, 3, 3)

The extra elements are removed, but the remaining elements no longer appear in the order they had in the initial list. How can I do this operation and keep the order of the original list?

So the result for l1 should be:

List(1, 2, 3, 1, 3, 2, 5)


asked Dec 12 '14 23:12

rtruszk

2 Answers

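Not the most efficient, but it keeps the original order: pair each element with its index, group by value, keep the first two entries of each group, then sort by index and drop the indices.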

scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)

scala> l1.zipWithIndex.groupBy( _._1 ).map(_._2.take(2)).flatten.toList.sortBy(_._2).unzip._1
res10: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
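
To make the one-liner above easier to follow, here is a sketch that unpacks it into named steps. The function name keepFirstTwo and the intermediate names are only illustrative and are not part of the original answer; the logic is meant to be the same.

```scala
// Illustrative helper (hypothetical name): the same pipeline, one step per line.
def keepFirstTwo(xs: List[Int]): List[Int] = {
  // 1. Pair every element with its position: (value, index)
  val indexed: List[(Int, Int)] = xs.zipWithIndex

  // 2. Group the pairs by value. The Map's key order is unspecified,
  //    but each group keeps its pairs in the order they were encountered.
  val grouped: Map[Int, List[(Int, Int)]] = indexed.groupBy(_._1)

  // 3. Keep at most the first two pairs of each group
  val kept: List[(Int, Int)] = grouped.values.flatMap(_.take(2)).toList

  // 4. Restore the original order by sorting on the index, then drop the index
  kept.sortBy(_._2).map(_._1)
}

val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
val result = keepFirstTwo(l1) // List(1, 2, 3, 1, 3, 2, 5)
```

The sort in step 4 is what undoes the reordering introduced by groupBy, at the cost of an extra O(n log n) pass.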


answered Sep 27 '22 19:09

Soumya Simanta

My humble answer:

def distinctOrder[A](x: List[A]): List[A] = {
    // `covered` accumulates the kept elements in reverse order
    @scala.annotation.tailrec
    def distinctOrderRec(list: List[A], covered: List[A]): List[A] = {
       list match {
         case Nil => covered.reverse
         // two copies of `head` are already kept, so skip this one
         case head :: tail if covered.count(_ == head) >= 2 => distinctOrderRec(tail, covered)
         case head :: tail => distinctOrderRec(tail, head :: covered)
       }
    }
    distinctOrderRec(x, Nil)
}

With the results:

scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)

scala> distinctOrder(l1)
res1: List[Int] = List(1, 2, 3, 1, 3, 2, 5)

On Edit: Right before I went to bed I came up with this!

l1.foldLeft(List[Int]())((total, next) => if (total.count(_ == next) >= 2) total else total :+ next)

With an answer of:

res9: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
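
Both versions above call count on the accumulated list for every element, and the fold also appends with :+, so the work grows roughly quadratically with the list length. As a possible variation (a sketch, not something from the original answer), the fold can carry a Map of occurrence counts and prepend instead of append. The name keepAtMost and the max parameter are assumptions introduced here for illustration.

```scala
// Hypothetical helper: keep at most `max` occurrences of each value,
// preserving the order of first appearance.
def keepAtMost[A](xs: List[A], max: Int): List[A] = {
  // Accumulator: (occurrences kept so far per value, kept elements in reverse order)
  val (_, keptReversed) = xs.foldLeft((Map.empty[A, Int], List.empty[A])) {
    case ((counts, kept), x) =>
      val seen = counts.getOrElse(x, 0)
      if (seen >= max) (counts, kept)               // already kept `max` copies: skip
      else (counts.updated(x, seen + 1), x :: kept) // count it and keep it
  }
  // Elements were prepended, so reverse once at the end
  keptReversed.reverse
}

val capped = keepAtMost(List(1, 2, 3, 1, 1, 3, 2, 5, 1), 2) // List(1, 2, 3, 1, 3, 2, 5)
```

With max = 1 this should behave like distinct, and it works for any element type, not only Int.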

answered Sep 27 '22 18:09

Daniel Hinojosa
