Lets assume we have a Scala list:
val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
We can easily remove duplicates using the following code:
l1.distinct
or
l1.toSet.toList
But what if we want to remove duplicates only if there are more than 2 of them? So if there are more than 2 elements with the same value we remain only two and remove the rest of them.
I could achieve it with following code:
l1.groupBy(identity).mapValues(_.take(2)).values.toList.flatten
that gave me the result:
List(2, 2, 5, 1, 1, 3, 3)
Elements are removed but the order of remaining elements is different from how these elements appeared in the initial list. How to do this operation and remain the order from original list?
So the result for l1 should be:
List(1, 2, 3, 1, 3, 2, 5)
When duplicates are removed, the first occurrence of the value in the list is kept, but other identical values are deleted. Because you are permanently deleting data, it's a good idea to copy the original range of cells or table to another worksheet or workbook before removing duplicate values.
1. Remove Duplicates from a List Using Plain Java. Removing the duplicate elements from a List with the standard Java Collections Framework is done easily through a Set: As you can see, the original list remains unchanged.
We will create a drop-down list and remove the duplicates from that drop-down list by using keyboard shortcuts, Data Validation Command, Pivot Table and combine the SORT, FILTER, and UNIQUE functions in Excel. Here’s an overview of the dataset for today’s task. 1. Apply the Keyboard Shortcuts to Remove Duplicates from Drop Down List in Excel
As we can see, the original list remains unchanged. In the example above, we used HashSet implementation, which is an unordered collection. As a result, the order of the cleaned-up listWithoutDuplicates might be different than the order of the original listWithDuplicates. If we need to preserve the order, we can use LinkedHashSet instead: 3.
Let’s learn this method! To remove duplicates based on criteria using the advanced filter, select the whole dataset, go to the Data tab, then in the Sort & Filte r group, click on Advanced. In the Advanced Filter window, check on Filter the List, in-Place to filter the dataset in its current location.
Not the most efficient.
scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
scala> l1.zipWithIndex.groupBy( _._1 ).map(_._2.take(2)).flatten.toList.sortBy(_._2).unzip._1
res10: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
My humble answer:
def distinctOrder[A](x:List[A]):List[A] = {
@scala.annotation.tailrec
def distinctOrderRec(list: List[A], covered: List[A]): List[A] = {
(list, covered) match {
case (Nil, _) => covered.reverse
case (lst, c) if c.count(_ == lst.head) >= 2 => distinctOrderRec(list.tail, covered)
case _ => distinctOrderRec(list.tail, list.head :: covered)
}
}
distinctOrderRec(x, Nil)
}
With the results:
scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
scala> distinctOrder(l1)
res1: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
On Edit: Right before I went to bed I came up with this!
l1.foldLeft(List[Int]())((total, next) => if (total.count(_ == next) >= 2) total else total :+ next)
With an answer of:
res9: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With