Merging Maps using `aggregate`

Tags: parallel-processing, aggregate, map, scala, scala-collections

Given a collection of Map instances, for example,

val in = Array( Map("a" -> 1,  "b" -> 2),
                Map("a" -> 11, "c" -> 4),
                Map("b" -> 7,  "c" -> 10))

how can aggregate be used on in.par to merge the maps into

Map("a" -> 12, "b" -> 9, "c" -> 14)

Note: Map merging has been asked about many times, but I am looking for a solution that uses aggregate on parallel collections.

Many thanks.

asked Aug 20 '14 08:08

elm

1 Answer

How about applying merge as both seqop and comboop?

val in = Array(
  Map("a" -> 1,  "b" -> 2),
  Map("a" -> 11, "c" -> 4),
  Map("b" -> 7,  "c" -> 10)
)

// Merge two maps, summing the values of keys that appear in both.
def merge(m1: Map[String, Int], m2: Map[String, Int]): Map[String, Int] =
  m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }

in.par.aggregate(Map[String, Int]())(merge, merge)
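aggregate may use the initial value once per partition and may group partial results in different ways, so merge should treat the empty map as neutral and should be associative. A minimal sanity check of both properties on the sample maps, run sequentially without parallel collections (the object name MergeProperties and the value names are only for illustration):

```scala
object MergeProperties {
  // Same merge as above: sums the values of keys present in both maps.
  def merge(m1: Map[String, Int], m2: Map[String, Int]): Map[String, Int] =
    m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }

  val a = Map("a" -> 1, "b" -> 2)
  val b = Map("a" -> 11, "c" -> 4)
  val c = Map("b" -> 7, "c" -> 10)
  val zero = Map.empty[String, Int]

  // The empty map is neutral on either side.
  val neutralLeft  = merge(zero, a) == a
  val neutralRight = merge(a, zero) == a

  // Grouping does not change the result for the sample maps.
  val leftGrouped  = merge(merge(a, b), c)
  val rightGrouped = merge(a, merge(b, c))

  def main(args: Array[String]): Unit = {
    println(neutralLeft && neutralRight)
    println(leftGrouped == rightGrouped)
    println(leftGrouped)
  }
}
```

This only checks the sample data; it is not a proof, although summing values per key is associative in general.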

Update

You pass aggregate an initial accumulator value (here an empty map) and two closures, seqop and comboop.

A parallel sequence is split into several partitions that are processed in parallel. Each partition is processed by successively applying seqop to the accumulator and the next array element.

def seqop(
    accumulator: Map[String, Int], 
    element: Map[String, Int]): Map[String, Int] = merge(accumulator, element)

seqop takes the initial accumulator value and the first array element and merges them. It then takes the previous result and the next array element, and so on, until the whole partition is merged into one map.
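What happens inside a single partition can be sketched as a plain left fold. The partition below is hypothetical (the library decides the real split), and SeqopOnPartition is just an illustrative name:

```scala
object SeqopOnPartition {
  def merge(m1: Map[String, Int], m2: Map[String, Int]): Map[String, Int] =
    m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }

  def seqop(
      accumulator: Map[String, Int],
      element: Map[String, Int]): Map[String, Int] = merge(accumulator, element)

  // Assumed partition: the first two maps of the sample input.
  val partition = Vector(Map("a" -> 1, "b" -> 2), Map("a" -> 11, "c" -> 4))

  // Start from the empty map and apply seqop element by element.
  val partial = partition.foldLeft(Map.empty[String, Int])(seqop)

  def main(args: Array[String]): Unit =
    println(partial)
}
```

For this assumed partition the partial result holds a -> 12, b -> 2 and c -> 4.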

Once every partition has been merged into its own map, these maps are combined by applying comboop. comboop takes the merged map from one partition and the merged map from another and merges them together. It then combines that result with the map from a further partition, and so on, until everything is merged into one map. The grouping of these combinations is not fixed, so comboop needs to be associative, which merge is. The final map is the result of aggregate.

def comboop(
    m1: Map[String, Int], 
    m2: Map[String, Int]): Map[String, Int] = merge(m1, m2)
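Putting both steps together, the whole process can be imitated sequentially. This is only a sketch of the idea, not how the library implements aggregate: the split into chunks of two is an assumption, and AggregateSketch is an illustrative name.

```scala
object AggregateSketch {
  def merge(m1: Map[String, Int], m2: Map[String, Int]): Map[String, Int] =
    m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }

  val in = Vector(
    Map("a" -> 1,  "b" -> 2),
    Map("a" -> 11, "c" -> 4),
    Map("b" -> 7,  "c" -> 10)
  )

  // Assumed partitioning: chunks of two elements each.
  val partitions = in.grouped(2).toVector

  // seqop step: fold every partition into one map, starting from the empty map.
  val partials = partitions.map(_.foldLeft(Map.empty[String, Int])(merge))

  // comboop step: combine the per-partition maps into the final result.
  val result = partials.reduce(merge)

  def main(args: Array[String]): Unit =
    println(result)
}
```

With this split, the two partial maps combine into a -> 12, b -> 9, c -> 14, the same map the question asks for.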

It is just a coincidence that seqop and comboop are the same here. In general they differ in logic and signature: seqop combines an accumulator with an element, while comboop combines two accumulators.
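To see the two signatures diverge, consider summing all values across the maps instead of merging them. The accumulator is then an Int while the elements are still maps. The sketch below imitates the partitioning with grouped, as an assumption, rather than using a parallel collection; DifferentOps is an illustrative name.

```scala
object DifferentOps {
  val in = Vector(
    Map("a" -> 1,  "b" -> 2),
    Map("a" -> 11, "c" -> 4),
    Map("b" -> 7,  "c" -> 10)
  )

  // seqop: folds one element (a Map) into an Int accumulator.
  def seqop(acc: Int, element: Map[String, Int]): Int =
    acc + element.values.sum

  // comboop: combines two Int accumulators.
  def comboop(x: Int, y: Int): Int = x + y

  // Assumed partitioning into chunks of two, then fold each and combine.
  val total = in.grouped(2).toVector.map(_.foldLeft(0)(seqop)).reduce(comboop)

  def main(args: Array[String]): Unit =
    println(total) // 35
}
```

On a parallel collection the same pair of functions would be passed as in.par.aggregate(0)(seqop, comboop).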

answered Sep 21 '22 18:09

lambdas
