Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to merge two maps and sum the values of same key?

Tags:

merge

map

scala

val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)

I want to merge them, and sum the values of same keys. So the result will be:

Map(2->20, 1->109, 3->300)

Now I have 2 solutions:

val list = map1.toList ++ map2.toList
val merged = list.groupBy ( _._1) .map { case (k,v) => k -> v.map(_._2).sum }

and

val merged = (map1 /: map2) { case (map, (k,v)) =>
    map + ( k -> (v + map.getOrElse(k, 0)) )
}

But I want to know if there are any better solutions.

like image 634
Freewind Avatar asked Aug 16 '11 09:08

Freewind


People also ask

How do I merge two maps in groovy?

The easiest way to merge two maps in Groovy is to use + operator. This method is straightforward - it creates a new map from the left-hand-side and right-hand-side maps.

Can map have duplicate keys in Scala?

Scala Map is a collection of Key-value pair. A map cannot have duplicate keys but different keys can have same values i.e keys are unique whereas values can be duplicate.


4 Answers

The shortest answer I know of that uses only the standard library is

map1 ++ map2.map{ case (k,v) => k -> (v + map1.getOrElse(k,0)) }
like image 84
Rex Kerr Avatar answered Oct 21 '22 01:10

Rex Kerr


Scalaz has the concept of a Semigroup which captures what you want to do here, and leads to arguably the shortest/cleanest solution:

scala> import scalaz._
import scalaz._

scala> import Scalaz._
import Scalaz._

scala> val map1 = Map(1 -> 9 , 2 -> 20)
map1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 9, 2 -> 20)

scala> val map2 = Map(1 -> 100, 3 -> 300)
map2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 100, 3 -> 300)

scala> map1 |+| map2
res2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 109, 3 -> 300, 2 -> 20)

Specifically, the binary operator for Map[K, V] combines the keys of the maps, folding V's semigroup operator over any duplicate values. The standard semigroup for Int uses the addition operator, so you get the sum of values for each duplicate key.

Edit: A little more detail, as per user482745's request.

Mathematically a semigroup is just a set of values, together with an operator that takes two values from that set, and produces another value from that set. So integers under addition are a semigroup, for example - the + operator combines two ints to make another int.

You can also define a semigroup over the set of "all maps with a given key type and value type", so long as you can come up with some operation that combines two maps to produce a new one which is somehow the combination of the two inputs.

If there are no keys that appear in both maps, this is trivial. If the same key exists in both maps, then we need to combine the two values that the key maps to. Hmm, haven't we just described an operator which combines two entities of the same type? This is why in Scalaz a semigroup for Map[K, V] exists if and only if a Semigroup for V exists - V's semigroup is used to combine the values from two maps which are assigned to the same key.

So because Int is the value type here, the "collision" on the 1 key is resolved by integer addition of the two mapped values (as that's what Int's semigroup operator does), hence 100 + 9. If the values had been Strings, a collision would have resulted in string concatenation of the two mapped values (again, because that's what the semigroup operator for String does).

(And interestingly, because string concatenation is not commutative - that is, "a" + "b" != "b" + "a" - the resulting semigroup operation isn't either. So map1 |+| map2 is different from map2 |+| map1 in the String case, but not in the Int case.)

like image 143
Andrzej Doyle Avatar answered Oct 21 '22 03:10

Andrzej Doyle


Quick solution:

(map1.keySet ++ map2.keySet).map {i=> (i,map1.getOrElse(i,0) + map2.getOrElse(i,0))}.toMap
like image 48
Matthew Farwell Avatar answered Oct 21 '22 01:10

Matthew Farwell


Well, now in scala library (at least in 2.10) there is something you wanted - merged function. BUT it's presented only in HashMap not in Map. It's somewhat confusing. Also the signature is cumbersome - can't imagine why I'd need a key twice and when I'd need to produce a pair with another key. But nevertheless, it works and much cleaner than previous "native" solutions.

val map1 = collection.immutable.HashMap(1 -> 11 , 2 -> 12)
val map2 = collection.immutable.HashMap(1 -> 11 , 2 -> 12)
map1.merged(map2)({ case ((k,v1),(_,v2)) => (k,v1+v2) })

Also in scaladoc mentioned that

The merged method is on average more performant than doing a traversal and reconstructing a new immutable hash map from scratch, or ++.

like image 41
Mikhail Golubtsov Avatar answered Oct 21 '22 03:10

Mikhail Golubtsov