How to remove 2 or more duplicates from list and maintain their initial order?

Tags: scala

Let's assume we have a Scala list:

val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)

We can easily remove duplicates using the following code:

l1.distinct

or

l1.toSet.toList

But what if we want to remove duplicates only when a value occurs more than twice? That is, if more than two elements share the same value, we keep the first two and remove the rest.

I was able to achieve this with the following code:

l1.groupBy(identity).mapValues(_.take(2)).values.toList.flatten

which gave me the result:

List(2, 2, 5, 1, 1, 3, 3)

The extra elements are removed, but the remaining elements no longer appear in the order they had in the initial list. How can I do this while preserving the order of the original list?

So the result for l1 should be:

List(1, 2, 3, 1, 3, 2, 5)
rtruszk asked Dec 12 '14 23:12


2 Answers

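This is not the most efficient approach: it pairs each element with its index, groups by value, keeps the first two occurrences in each group, and then sorts by index to restore the original order.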

scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)

scala> l1.zipWithIndex.groupBy( _._1 ).map(_._2.take(2)).flatten.toList.sortBy(_._2).unzip._1
res10: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
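
The one-liner above can be unpacked into named steps, which may make it easier to see why sorting by the saved index restores the original order. This is only a sketch; the intermediate names `indexed` and `firstTwo` are introduced here for illustration and are not part of the original answer.

```scala
val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)

// Pair each element with its position: (1,0), (2,1), (3,2), (1,3), ...
val indexed = l1.zipWithIndex

// Group the pairs by value. Within each group the pairs stay in
// traversal order, so take(2) keeps the first two occurrences.
val firstTwo = indexed.groupBy(_._1).values.flatMap(_.take(2)).toList

// The groups come back in no particular order, so sort by the saved
// index and then drop the index again.
val result = firstTwo.sortBy(_._2).map(_._1)
```

For `l1` this should yield the same list as the one-liner, `List(1, 2, 3, 1, 3, 2, 5)`.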
Soumya Simanta answered Sep 27 '22 19:09


My humble answer:

def distinctOrder[A](x: List[A]): List[A] = {
    @scala.annotation.tailrec
    def distinctOrderRec(list: List[A], covered: List[A]): List[A] = {
       list match {
         // Input exhausted: covered was built in reverse, so flip it back.
         case Nil => covered.reverse
         // Two copies of this element are already kept: skip it.
         case head :: tail if covered.count(_ == head) >= 2 => distinctOrderRec(tail, covered)
         // Otherwise keep it by prepending to the reversed accumulator.
         case head :: tail => distinctOrderRec(tail, head :: covered)
       }
    }
    distinctOrderRec(x, Nil)
}

With the results:

scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)

scala> distinctOrder(l1)
res1: List[Int] = List(1, 2, 3, 1, 3, 2, 5)

On Edit: Right before I went to bed I came up with this!

l1.foldLeft(List[Int]())((total, next) => if (total.count(_ == next) >= 2) total else total :+ next)

With an answer of:

res9: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
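
Both versions above call `count` on the accumulated list for every element, which can become quadratic on long lists. One possible alternative is to carry the occurrence counts along in an immutable `Map` during the fold. The sketch below assumes a hypothetical helper named `keepAtMost` with a `limit` parameter; neither appears in the original answer, and elements are assumed to have sensible `equals`/`hashCode` so they can be used as map keys.

```scala
// Hypothetical helper: keeps at most `limit` occurrences of each value,
// preserving the order in which the kept elements first appeared.
def keepAtMost[A](xs: List[A], limit: Int): List[A] = {
  // The accumulator is (counts seen so far, kept elements in reverse order).
  val (_, keptReversed) = xs.foldLeft((Map.empty[A, Int], List.empty[A])) {
    case ((counts, kept), x) =>
      val seen = counts.getOrElse(x, 0)
      if (seen >= limit) (counts, kept)              // already have enough copies
      else (counts.updated(x, seen + 1), x :: kept)  // keep it and bump the count
  }
  // Elements were prepended, so reverse once at the end.
  keptReversed.reverse
}

val limited = keepAtMost(List(1, 2, 3, 1, 1, 3, 2, 5, 1), 2)
```

Prepending and reversing once at the end avoids the repeated `:+` appends, which are linear on a `List`. With `limit = 2` on `l1`, `limited` should equal `List(1, 2, 3, 1, 3, 2, 5)`.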
Daniel Hinojosa answered Sep 27 '22 18:09