Cleaner tuple groupBy

I have a sequence of key-value pairs (String, Int), and I want to group them by key into a map from each key to its sequence of values (i.e. Seq[(String, Int)] => Map[String, Iterable[Int]]).

Obviously, toMap isn't useful here, and groupBy maintains the values as tuples. The best I managed to come up with is:

val seq: Seq[( String, Int )]
// ...
seq.groupBy( _._1 ).mapValues( _.map( _._2 ) )

Is there a cleaner way of doing this?

Tomer Gabel asked May 28 '12 11:05


2 Answers

Here's a "pimp my library" style enrichment that adds a toMultiMap method to traversables. Would it solve your problem?

import collection._
import mutable.Builder
import generic.CanBuildFrom

class TraversableOnceExt[CC, A](coll: CC, asTraversable: CC => TraversableOnce[A]) {

  def toMultiMap[T, U, That](implicit ev: A <:< (T, U), cbf: CanBuildFrom[CC, U, That]): immutable.Map[T, That] =
    toMultiMapBy(ev)

  def toMultiMapBy[T, U, That](f: A => (T, U))(implicit cbf: CanBuildFrom[CC, U, That]): immutable.Map[T, That] = {
    val mutMap = mutable.Map.empty[T, mutable.Builder[U, That]]
    for (x <- asTraversable(coll)) {
      val (key, value) = f(x)
      val builder = mutMap.getOrElseUpdate(key, cbf(coll))
      builder += value
    }
    val mapBuilder = immutable.Map.newBuilder[T, That]
    for ((k, v) <- mutMap)
      mapBuilder += ((k, v.result))
    mapBuilder.result
  }
}

implicit def commomExtendTraversable[A, C[A] <: TraversableOnce[A]](coll: C[A]): TraversableOnceExt[C[A], A] =
  new TraversableOnceExt[C[A], A](coll, identity)

Which can be used like this:

val map = List(1 -> 'a', 1 -> 'à', 2 -> 'b').toMultiMap
println(map)  // Map(1 -> List(a, à), 2 -> List(b))

val byFirstLetter = Set("abc", "aeiou", "cdef").toMultiMapBy(elem => (elem.head, elem))
println(byFirstLetter) // Map(c -> Set(cdef), a -> Set(abc, aeiou))

If you add the following implicit defs, it will also work with collection-like objects such as Strings and Arrays:

implicit def commomExtendStringTraversable(string: String): TraversableOnceExt[String, Char] =
  new TraversableOnceExt[String, Char](string, implicitly)

implicit def commomExtendArrayTraversable[A](array: Array[A]): TraversableOnceExt[Array[A], A] =
  new TraversableOnceExt[Array[A], A](array, implicitly)

Then:

val withArrays = Array(1 -> 'a', 1 -> 'à', 2 -> 'b').toMultiMap
println(withArrays) // Map(1 -> [C@377653ae, 2 -> [C@396fe0f4)

val byLowercaseCode = "Mama".toMultiMapBy(c => (c.toLower.toInt, c))
println(byLowercaseCode) // Map(97 -> aa, 109 -> Mm)
Jean-Philippe Pellet answered Oct 19 '22 02:10


There's no method or data structure in the standard library to do this, and your solution looks about as concise as you'll get. If you use this in more than one place, you might like to factor it out into a utility method:

def groupTuples[A, B](seq: Seq[(A, B)]) = 
  seq groupBy (_._1) mapValues (_ map (_._2))

You then call it as groupTuples(seq). This might not be the most efficient approach in terms of CPU clock cycles, but I don't think it's particularly inefficient either.
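
One caveat worth illustrating: on Scala 2.12 and earlier, mapValues returns a lazy wrapper that re-applies its function on every lookup, and on Scala 2.13 it is deprecated on Map in favour of .view.mapValues(...).toMap. The sketch below assumes Scala 2.13 or later, where groupMap was added to the standard library (after this answer was written). The names GroupTuplesExample and groupTuplesStrict are hypothetical and used only for illustration.

```scala
// A minimal sketch, assuming Scala 2.13+ (groupMap does not exist in earlier versions).
// GroupTuplesExample and groupTuplesStrict are hypothetical names, not part of any library.
object GroupTuplesExample {

  // Groups the pairs by their first element and keeps only the second element
  // of each pair, producing a strict (fully evaluated) immutable Map.
  def groupTuplesStrict[A, B](seq: Seq[(A, B)]): Map[A, Seq[B]] =
    seq.groupMap(_._1)(_._2)

  def main(args: Array[String]): Unit = {
    val pairs = Seq("a" -> 1, "b" -> 2, "a" -> 3)
    println(groupTuplesStrict(pairs))
  }
}
```

For the input above, the result maps "a" to the values 1 and 3 (in their original order) and "b" to the value 2.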

I did a rough benchmark against Jean-Philippe's solution on a list of 9 tuples and this is marginally faster. Both were about twice as fast as folding the sequence into a map (effectively re-implementing groupBy to give the output you want).
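
For reference, a fold-based variant of the kind mentioned above might look like the following. This is only a sketch of one possible fold, not the code that was actually benchmarked; the names FoldGroupExample and groupByFold, and the choice of Vector for the values, are assumptions made for illustration.

```scala
// A minimal sketch of grouping by folding into an immutable Map.
// FoldGroupExample and groupByFold are hypothetical names; Vector is an assumed value container.
object FoldGroupExample {

  // Walks the sequence once, appending each value to the Vector stored under its key.
  // Every step builds a new immutable Map, which is one plausible reason a fold
  // like this can be slower than groupBy followed by mapValues.
  def groupByFold[A, B](seq: Seq[(A, B)]): Map[A, Vector[B]] =
    seq.foldLeft(Map.empty[A, Vector[B]]) { case (acc, (k, v)) =>
      acc.updated(k, acc.getOrElse(k, Vector.empty[B]) :+ v)
    }

  def main(args: Array[String]): Unit = {
    val pairs = Seq("a" -> 1, "b" -> 2, "a" -> 3)
    println(groupByFold(pairs))
  }
}
```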

Luigi Plinge answered Oct 19 '22 00:10