Usually I call distinct on a List to remove duplicates, or turn it into a Set. Now I have a List[MyObject]. MyObject is a case class, see below:
case class MyObject(s1: String, s2:String, s3:String)
Let's say we have the following cases:
val myObj1 = MyObject("", "gmail,com", "some text")
val myObj2 = MyObject("", "gmail,com", "")
val myObj3 = MyObject("some text", "gmail.com", "")
val myObj4 = MyObject("some text", "gmail.com", "some text")
val myObj5 = MyObject("", "ymail.com", "")
val myObj6 = MyObject("", "ymail.com", "some text")
val myList = List(myObj1, myObj2, myObj3, myObj4, myObj5, myObj6)
Two Questions:
How can I make the List distinct based on s2?
How can I count how many objects are affected? Duplicates based on the content of s2?
I would consider two case objects the same when their s2 values are equal. Do I need to turn the case class into a normal class and override equals? Do I need my own Comparator for this, or can I use some Scala API method to achieve the same?
To count how many objects are in each duplicate group (if you only want to know how many objects will be removed from a group, subtract 1 from its size):
myList.groupBy(_.s2).map(x => (x._1, x._2.size))
res0: scala.collection.immutable.Map[String,Int] = Map(ymail.com -> 2, gmail.com -> 2, gmail,com -> 2)
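The "subtract 1 from size" remark can be spelled out as a small sketch. The names removedPerGroup and totalRemoved are made up for illustration, and the list is rebuilt inline so the snippet runs on its own:

```scala
case class MyObject(s1: String, s2: String, s3: String)

val myList = List(
  MyObject("", "gmail,com", "some text"),
  MyObject("", "gmail,com", ""),
  MyObject("some text", "gmail.com", ""),
  MyObject("some text", "gmail.com", "some text"),
  MyObject("", "ymail.com", ""),
  MyObject("", "ymail.com", "some text")
)

// For each s2 value, everything beyond the first element counts as a duplicate.
val removedPerGroup: Map[String, Int] =
  myList.groupBy(_.s2).map { case (key, group) => key -> (group.size - 1) }

// Total number of objects that would be dropped when de-duplicating by s2.
val totalRemoved: Int = removedPerGroup.values.sum
```

With the sample data each of the three s2 values appears twice, so one object per group would be removed.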
How can I make the List distinct based on s2?
myList.groupBy(_.s2).map(_._2.head)
res1: scala.collection.immutable.Iterable[MyObject] = List(MyObject(,ymail.com,), MyObject(some text,gmail.com,), MyObject(,gmail,com,some text))
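Note that groupBy returns a Map, so the result above does not keep the original list order. If order matters, one possible alternative (not part of the answer above) is distinctBy, which is available on collections from Scala 2.13 onward and keeps the first element seen for each key. For older versions, a fold that tracks the keys already seen should behave the same way. The names distinctByS2 and distinctByS2Fold are only illustrative:

```scala
case class MyObject(s1: String, s2: String, s3: String)

val myList = List(
  MyObject("", "gmail,com", "some text"),
  MyObject("", "gmail,com", ""),
  MyObject("some text", "gmail.com", ""),
  MyObject("some text", "gmail.com", "some text"),
  MyObject("", "ymail.com", ""),
  MyObject("", "ymail.com", "some text")
)

// Scala 2.13+: keeps the first element for each s2 value, in list order.
val distinctByS2: List[MyObject] = myList.distinctBy(_.s2)

// Pre-2.13 sketch: fold over the list, remembering which s2 values were seen.
val distinctByS2Fold: List[MyObject] = myList
  .foldLeft((Vector.empty[MyObject], Set.empty[String])) {
    case ((kept, seen), obj) =>
      if (seen(obj.s2)) (kept, seen)       // duplicate key: skip it
      else (kept :+ obj, seen + obj.s2)    // first occurrence: keep it
  }
  ._1
  .toList
```

Either way the case class stays as it is; there is no need to override equals or write a comparator, since the key function alone decides what counts as a duplicate.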