
Scala: Why does SortedMap's mapValues return a Map and not a SortedMap?

I'm new to Scala. I'm using SortedMap in my code, and I wanted to use mapValues to create a new map with some transformation on the values.

Instead of returning a new SortedMap, the mapValues function returns a new Map, which I then have to convert to a SortedMap.

For example:

import scala.collection.immutable.SortedMap

val my_map = SortedMap(1 -> "one", 0 -> "zero", 2 -> "two")
val new_map = my_map.mapValues(name => name.toUpperCase)
// returns scala.collection.immutable.Map[Int,java.lang.String] = Map(0 -> ZERO, 1 -> ONE, 2 -> TWO)
val sorted_new_map = SortedMap(new_map.toArray: _*)

This looks inefficient: the last conversion probably sorts the keys again, or at least verifies that they are sorted.

I could use the normal map function, which operates on both the keys and the values, and deliberately leave the keys unchanged in my transformation function. This looks inefficient too, since the implementation of Map probably assumes that the transformation may change the order of the keys (as in my_map.map(tup => (-tup._1, tup._2))), so it probably re-sorts them too.

Is anyone familiar with the internal implementations of Map and SortedMap who could tell me whether my assumptions are correct? Can the compiler recognize automatically that the keys have not been reordered? Is there an internal reason why mapValues should not return a SortedMap? Is there a better way to transform the map's values without losing the order of the keys?

Thanks

Oren asked Sep 26 '12 21:09


1 Answer

You've stumbled upon a tricky feature of Scala's Map implementation. The catch that you are missing is that mapValues does not actually return a new Map: it returns a view of a Map. In other words, it wraps your original map in such a way that whenever you access a value it will compute .toUpperCase before returning the value to you.

The upside to this behavior is that Scala won't compute the function for values that aren't accessed, and it won't spend time copying all the data into a new Map. The downside is that the function is re-computed every time that value is accessed. So you might end up doing extra computation if you access the same values many times.
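A minimal sketch of that trade-off, assuming the lazy mapValues behavior described above. The object name MapValuesViewDemo, the calls counter, and the lookupTwice helper are hypothetical names introduced only for illustration, and newer Scala versions may emit a deprecation warning for mapValues:

```scala
import scala.collection.immutable.SortedMap

object MapValuesViewDemo {
  // Hypothetical counter: records how often the mapping function runs.
  var calls = 0

  val original = SortedMap(1 -> "one", 0 -> "zero", 2 -> "two")

  // Building the wrapper does not transform any value yet.
  val upper = original.mapValues { v => calls += 1; v.toUpperCase }

  // Looks up the same key twice and reports how many times the
  // mapping function ran for those two lookups.
  def lookupTwice(): Int = {
    val before = calls
    val first = upper(1)
    val second = upper(1)
    calls - before
  }
}
```

If the wrapper is lazy as described, calls stays at 0 until a value is read, and lookupTwice() reports 2 because nothing is cached between the two reads.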

So why does mapValues on a SortedMap not return a SortedMap? Because it's actually returning a Map-wrapper. The underlying Map, the one that is wrapped, is still a SortedMap, so if you were to iterate through it, the entries would still come out in sorted order. You and I know that, but the type-checker doesn't. It certainly seems like they could have written it in such a way that it still maintains the SortedMap trait, but they didn't.

You can see in the code that it's not returning a SortedMap, but that the iteration behavior is still going to be sorted:

// from MapLike
override def mapValues[C](f: B => C): Map[A, C] = new DefaultMap[A, C] {
  def iterator = for ((k, v) <- self.iterator) yield (k, f(v))
  ...
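To check the iteration claim yourself, here is a small sketch. The object name SortedIterationDemo and the value names are made up for illustration, and the static type the compiler reports for the wrapper may differ between Scala versions, so the example only looks at the order of the elements:

```scala
import scala.collection.immutable.SortedMap

object SortedIterationDemo {
  val myMap = SortedMap(1 -> "one", 0 -> "zero", 2 -> "two")

  // Whatever static type the wrapper has, iterating it walks the
  // underlying SortedMap, so the keys come out in ascending order.
  val pairs: List[(Int, String)] =
    myMap.mapValues(_.toUpperCase).iterator.toList
  // pairs == List((0, "ZERO"), (1, "ONE"), (2, "TWO"))
}
```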

The solution to your problem is the same as the solution to getting around the view issue: use .map{ case (k,v) => (k,f(v)) }, as you mentioned in your question.
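A short sketch of the .map approach applied to the map from the question. The object and value names are hypothetical; the explicit SortedMap type annotation is there only to show that the result type-checks as a SortedMap, with no conversion step afterwards:

```scala
import scala.collection.immutable.SortedMap

object MapKeepsSortedDemo {
  val myMap: SortedMap[Int, String] =
    SortedMap(1 -> "one", 0 -> "zero", 2 -> "two")

  // The keys pass through unchanged; only the values are transformed.
  // This compiles because map on a SortedMap yields a SortedMap when
  // the function returns key/value pairs with an ordered key type.
  val upper: SortedMap[Int, String] =
    myMap.map { case (k, v) => (k, v.toUpperCase) }
}
```

Unlike the mapValues wrapper, the result here is built once, so the function runs exactly once per entry.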


If you really want that convenience method though, you can do what I do, and write your own, better, version of mapValues:

// These imports come from the pre-2.13 collections library (CanBuildFrom was removed in 2.13).
import scala.collection.{GenTraversable, GenTraversableLike}
import scala.collection.generic.CanBuildFrom

class EnrichedWithMapVals[T, U, Repr <: GenTraversable[(T, U)]](self: GenTraversableLike[(T, U), Repr]) {
  /**
   * In a collection of pairs, map a function over the second item of each
   * pair.  Ensures that the map is computed at call-time, and not returned
   * as a view as 'Map.mapValues' would do.
   *
   * @param f   function to map over the second item of each pair
   * @return a collection of pairs
   */
  def mapVals[R, That](f: U => R)(implicit bf: CanBuildFrom[Repr, (T, R), That]): That = {
    val b = bf(self.asInstanceOf[Repr])
    b.sizeHint(self.size)
    for ((k, v) <- self) b += k -> f(v)
    b.result
  }
}
implicit def enrichWithMapVals[T, U, Repr <: GenTraversable[(T, U)]](self: GenTraversableLike[(T, U), Repr]): EnrichedWithMapVals[T, U, Repr] =
  new EnrichedWithMapVals(self)

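Now when you call mapVals on a SortedMap (here m1, which judging by the output below is something like SortedMap("aardvark" -> 1, "cow" -> 5, "dog" -> 9)), you get back a non-view SortedMap: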

scala> val m3 = m1.mapVals(_ + 1)
m3: SortedMap[String,Int] = Map(aardvark -> 2, cow -> 6, dog -> 10)

It actually works on any collection of pairs, not just Map implementations:

scala> List(('a,1),('b,2),('c,3)).mapVals(_+1)
res8: List[(Symbol, Int)] = List(('a,2), ('b,3), ('c,4))
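
One more option, which is not part of the approach above and is offered only as a hedged alternative: immutable maps also have a strict transform method that receives both key and value. Ignoring the key gives a values-only mapping, and as far as I know it keeps the SortedMap type on both older and newer collections libraries. The object and value names below are hypothetical:

```scala
import scala.collection.immutable.SortedMap

object TransformValuesDemo {
  val original: SortedMap[Int, String] =
    SortedMap(1 -> "one", 0 -> "zero", 2 -> "two")

  // transform passes (key, value) to the function; the key is ignored
  // here, so only the values change. The result is built eagerly,
  // not returned as a view.
  val upper: SortedMap[Int, String] =
    original.transform((_, v) => v.toUpperCase)
}
```

This may be handy where the CanBuildFrom-based mapVals above is not available, since it needs no extra implicit machinery.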

dhg answered Oct 02 '22 07:10