While looking at the documentation for map-reduce, I found the following:
NOTE:
For most aggregation operations, the Aggregation Pipeline provides better performance and a more coherent interface. However, map-reduce operations provide some flexibility that is not presently available in the aggregation pipeline.
I did not understand much from it.
Map-reduce is a common pattern when working with Big Data – it's a way to extract information from a huge dataset. Starting with version 2.2, however, MongoDB includes a new feature called the Aggregation Framework. Functionality-wise, Aggregation is equivalent to map-reduce, but on paper it promises to be much faster.
In MongoDB, map-reduce is a data processing programming model that helps perform operations on large data sets and produce aggregated results. MongoDB provides the mapReduce() function to perform map-reduce operations. This function takes two main functions as arguments: a map function and a reduce function.
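For concreteness, here is a minimal sketch of such a call in the mongo shell. The orders collection and its cust_id and amount fields are made up for illustration:

```javascript
// Map: runs once per document and emits key/value pairs.
var mapFn = function () {
  emit(this.cust_id, this.amount);
};

// Reduce: combines all values emitted for the same key.
var reduceFn = function (key, values) {
  return Array.sum(values);
};

// Run map-reduce and write the per-customer totals to a separate collection.
db.orders.mapReduce(mapFn, reduceFn, { out: "order_totals" });
```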
Map-reduce operations can be rewritten using aggregation pipeline operators, such as $group, $merge, and others. For map-reduce operations that require custom functionality, MongoDB provides the $accumulator and $function aggregation operators starting in version 4.4.
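As a sketch of such a rewrite, the same per-customer totals from the hypothetical orders collection above can be expressed as a pipeline ($merge requires MongoDB 4.2 or later):

```javascript
db.orders.aggregate([
  // $group replaces the map/reduce pair: one output document per cust_id.
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
  // $merge persists the results into another collection, much like the
  // map-reduce "out" option.
  { $merge: { into: "order_totals" } }
]);
```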
The pipeline provides efficient data aggregation using native operations and is the preferred method for data aggregation in MongoDB. The aggregation pipeline can also operate on sharded collections.
For one thing, Map/Reduce in MongoDB wasn't made for ad-hoc queries; there's considerable overhead to M/R. Even a very simple M/R operation on a small dataset can take hundreds of milliseconds because of that overhead.
I can't say much about the performance of M/R compared to the aggregation framework on large datasets in practice, but in theory, M/R operations on a large sharded database should be faster since the shards can run the operations largely in parallel.
As to the flexibility, since M/R actually runs JavaScript methods, you have the full power of the language at your disposal. For example, let's say you wanted to group some data by the cosine of a field's value. Since there was, at the time, neither a $cos operator in the aggregation framework nor a meaningful way to build discrete buckets from continuous numbers (something like $truncate), the aggregation framework wouldn't help in that case.
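To illustrate, here is a sketch of that cosine grouping as a map-reduce job, assuming a hypothetical samples collection with a numeric x field:

```javascript
// Map: bucket each document by the cosine of "x", rounded to one decimal
// place, using plain JavaScript math of the kind described above.
var mapFn = function () {
  var bucket = Math.round(Math.cos(this.x) * 10) / 10;
  emit(bucket, 1);
};

// Reduce: count how many documents landed in each cosine bucket.
var reduceFn = function (key, values) {
  return Array.sum(values);
};

// Return the bucket counts inline rather than writing them to a collection.
db.samples.mapReduce(mapFn, reduceFn, { out: { inline: 1 } });
```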
So, in a nutshell, I'd say the use cases are: the aggregation framework for ad-hoc aggregation of reasonably sized data, and M/R for heavier, scheduled processing of large datasets whose results you keep in a separate collection (using the out parameter and merging the results).