
How to find unique elements from list of tuples based on some elements using scala?

Tags: scala

I have the following list:

val a = List(("name1","add1","city1",10),("name1","add1","city1",10),
("name2","add2","city2",10),("name2","add2","city2",20),("name3","add3","city3",20))

I want the distinct elements of this list based on the first three values of each tuple. The fourth value should not be considered when finding distinct elements.

I want the following output:

val output = List(("name1","add1","city1",10),("name2","add2","city2",10),
("name3","add3","city3",20))

Is it possible to get this output?

As far as I know, distinct only removes an element when the whole tuple is duplicated. I tried distinct like this:

val b = List(("name1","add1","city1",10),("name1","add1","city1",10),("name2","add2","city2",10),
("name2","add2","city2",20),("name3","add3","city3",20)).distinct

but it gives this output:

List(("name1","add1","city1",10),("name2","add2","city2",10),
("name2","add2","city2",20),("name3","add3","city3",20))

Any alternative approach would also be appreciated.

asked Nov 27 '15 11:11 by Vishwas



2 Answers

Use groupBy like this

a.groupBy( v => (v._1,v._2,v._3)).keys.toList

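This constructs a Map whose keys are the triplets produced by the lambda above. Map keys are unique by definition, so taking the keys gives the distinct triplets.

If the result should also include the last element of each tuple, take the first tuple in each group, like this. Note that the result is a Map from triplet to tuple, so call .values.toList on it if you need a list.

a.groupBy(v => (v._1, v._2, v._3)).mapValues(_.head)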
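To make the intermediate result concrete, here is a minimal sketch of what groupBy produces for the list in the question and how to turn it back into a list. The names GroupByDemo, grouped and firstPerKey are only illustrative, and the final sortBy is an addition made here purely to get a stable order, since a Map's iteration order is not guaranteed.

```scala
object GroupByDemo {
  val a = List(
    ("name1", "add1", "city1", 10), ("name1", "add1", "city1", 10),
    ("name2", "add2", "city2", 10), ("name2", "add2", "city2", 20),
    ("name3", "add3", "city3", 20))

  // Map from (name, address, city) to every tuple sharing that triplet.
  // Within each group, tuples keep the order they had in the list.
  val grouped: Map[(String, String, String), List[(String, String, String, Int)]] =
    a.groupBy(v => (v._1, v._2, v._3))

  // First tuple of each group; sorted by name only to make the order stable.
  val firstPerKey: List[(String, String, String, Int)] =
    grouped.values.map(_.head).toList.sortBy(_._1)

  def main(args: Array[String]): Unit = {
    println(grouped.size)   // number of distinct triplets
    println(firstPerKey)
  }
}
```

Sorting only works here because the names happen to sort in the original order; it is not a general way to restore list order.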
answered Sep 28 '22 06:09 by elm


If the order of the output list isn't important (i.e. you are happy to get List(("name3","add3","city3",20),("name1","add1","city1",10),("name2","add2","city2",10))), the following works as specified:

a.groupBy(v => (v._1,v._2,v._3)).values.map(_.head).toList

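(For small results the order may appear to be preserved, because Scala has specialized immutable Map implementations for up to 4 entries, but that is an implementation detail. Larger maps use HashMap, which gives no ordering guarantee.) If you do need to keep the order, you can do something like the following (generalizing a bit):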

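import scala.collection.mutable.LinkedHashMap

// Keeps the first element seen for each key f(x), in the original order.
// f is in its own parameter list so that Scala 2 can infer the lambda's
// parameter type from xs.
def distinctBy[A, B](xs: Seq[A])(f: A => B): List[A] = {
  // LinkedHashMap iterates in insertion order, unlike HashMap.
  val seen = LinkedHashMap.empty[B, A]
  xs.foreach { x =>
    val key = f(x)
    if (!seen.contains(key)) { seen.update(key, x) }
  }
  seen.values.toList
}

distinctBy(a)(v => (v._1, v._2, v._3))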
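On Scala 2.13 and later, the standard library has a distinctBy method on sequences that keeps the first occurrence of each key in the original order, so the helper above may not be needed there. A minimal sketch, assuming Scala 2.13+ (the object name DistinctByDemo is illustrative):

```scala
object DistinctByDemo {
  val a = List(
    ("name1", "add1", "city1", 10), ("name1", "add1", "city1", 10),
    ("name2", "add2", "city2", 10), ("name2", "add2", "city2", 20),
    ("name3", "add3", "city3", 20))

  // Built-in since Scala 2.13: keeps the first element seen for each key
  // and preserves the order of the input list.
  val result: List[(String, String, String, Int)] =
    a.distinctBy(v => (v._1, v._2, v._3))

  def main(args: Array[String]): Unit =
    println(result)
}
```

On older Scala versions this method is not available, and a hand-written helper like the one above is still needed.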
answered Sep 28 '22 07:09 by Alexey Romanov