Can anyone please explain the concept of map-reduce, particularly in Mongo?
I also use C# so any specifics in that area would also be useful.
One way to understand Map-Reduce coming from C# and LINQ is to think of it as a SelectMany() followed by a GroupBy() followed by an Aggregate() operation.
In a SelectMany() you are projecting a sequence, but each element can become multiple elements. This is equivalent to calling emit multiple times in your map operation. The map operation can also choose not to call emit at all, which is like having a Where() clause inside your SelectMany() operation.
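To make the SelectMany() comparison concrete, here is a minimal sketch in plain JavaScript that does not talk to MongoDB at all. The runMap helper, the posts data, and the mapTags function are made up for illustration. In a real MongoDB map function the document is available as this and emit is supplied by the server, whereas this sketch passes both in as arguments.

```javascript
// Hypothetical helper: runs a map function over each document and records
// every emit(key, value) call as a [key, value] pair.
function runMap(docs, mapFn) {
  const emitted = [];
  const emit = (key, value) => emitted.push([key, value]);
  for (const doc of docs) {
    mapFn(doc, emit);
  }
  return emitted;
}

// Illustrative data: the second post has no tags.
const posts = [
  { title: "a", tags: ["mongo", "csharp"] },
  { title: "b", tags: [] },
  { title: "c", tags: ["mongo"] },
];

// One document can emit several pairs, or none at all (the Where() case).
function mapTags(doc, emit) {
  for (const tag of doc.tags) {
    emit(tag, 1);
  }
}

const pairs = runMap(posts, mapTags);
console.log(JSON.stringify(pairs));
// [["mongo",1],["csharp",1],["mongo",1]]
```

The post with no tags contributes nothing to the output, and the post with two tags contributes two pairs.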
In a GroupBy() you are collecting elements with the same key, which is what Map-Reduce does with the key value that you emit from the map operation.
In the Aggregate() or reduce step you are taking the collection of values associated with each group key and combining them in some way to produce one result for each key. Often this combination is simply adding up a '1' value emitted with each key in the map step, but sometimes it's more complicated.
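The three steps can be put together in a small in-memory sketch. This is only a model of the idea, not how MongoDB implements it. The mapReduce function, the visits data, and the field name page are assumptions made up for the example.

```javascript
// Hypothetical in-memory model of map -> group by key -> reduce.
function mapReduce(docs, mapFn, reduceFn) {
  // Map + group: every emitted value is collected under its key.
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(value);
  };
  for (const doc of docs) {
    mapFn(doc, emit);
  }

  // Reduce: combine the values of each group into one result per key.
  const results = {};
  for (const [key, values] of groups) {
    results[key] = reduceFn(key, values);
  }
  return results;
}

// Illustrative data: count visits per page by emitting a 1 for each visit.
const visits = [{ page: "/home" }, { page: "/docs" }, { page: "/home" }];

const visitCounts = mapReduce(
  visits,
  (doc, emit) => emit(doc.page, 1),
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);

console.log(visitCounts["/home"], visitCounts["/docs"]);
// 2 1
```

Note that the reduce function adds the values up rather than counting them, so its output has the same type as the values it receives.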
One important caveat with MongoDB's map-reduce is that the reduce operation must accept and output the same data type because it may be applied repeatedly to partial sets of the grouped data. If you are passed an array of values, don't simply take the length of it because it might be a partial result from an earlier reduce operation.
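A small sketch of why taking the length goes wrong. The reduceInBatches helper and its batchSize parameter are invented here to imitate reduce being applied to partial sets and then to the partial results. How MongoDB actually splits the values is internal and not modelled.

```javascript
// Wrong: assumes every value is a raw '1' emitted by map.
const countByLength = (key, values) => values.length;

// Right: output has the same shape as the input, so it can be re-reduced.
const countBySum = (key, values) => values.reduce((sum, v) => sum + v, 0);

// Hypothetical two-stage reduce: reduce each batch of values separately,
// then reduce the partial results once more.
function reduceInBatches(key, values, reduceFn, batchSize) {
  const partials = [];
  for (let i = 0; i < values.length; i += batchSize) {
    partials.push(reduceFn(key, values.slice(i, i + batchSize)));
  }
  return reduceFn(key, partials);
}

const ones = [1, 1, 1, 1, 1]; // five emits for the same key

// Batches [1,1], [1,1], [1] give partials [2, 2, 1] in both cases.
const wrongTotal = reduceInBatches("k", ones, countByLength, 2);
const rightTotal = reduceInBatches("k", ones, countBySum, 2);

console.log(wrongTotal, rightTotal);
// 3 5
```

The length-based version ends up counting the number of partial results (3) instead of the number of original emits (5), while the sum-based version gives the same answer no matter how the values are batched.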