MongoDB: What's the point of using MapReduce without parallelism?

Quoting http://www.mongodb.org/display/DOCS/MapReduce#MapReduce-Parallelism

As of right now, MapReduce jobs on a single mongod process are single threaded. This is due to a design limitation in current JavaScript engines. We are looking into alternatives to solve this issue, but for now if you want to parallelize your MapReduce jobs, you will need to either use sharding or do the aggregation client-side in your code.

Without parallelism, what are the benefits of MapReduce compared to simpler or more traditional methods for queries and data aggregation?

To avoid confusion: the question is NOT "what are the benefits of a document-oriented DB over a traditional relational DB?"

asked May 08 '10 14:05 by netvope



1 Answer

The main reason to use MapReduce over simpler or more traditional queries is that it can do something simple queries cannot: aggregation.

Once you need aggregation, MongoDB gives you two options: MapReduce and the group command. The group command is analogous to SQL's "group by", but it is limited because it has to return all its results in a single database response. That means group can only be used when you have less than 4MB of results. MapReduce, on the other hand, can do anything a "group by" can, and it writes its results to a new collection, so they can be as large as needed.
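To make the map/reduce shape concrete, here is a minimal single-threaded sketch in plain Python. It does not talk to MongoDB at all: the `orders` data and the `map_order`, `reduce_amounts`, and `map_reduce` names are made up for illustration, and the returned list only stands in for the output collection. It just shows a "group by"-style total expressed as a map step that emits key/value pairs and a reduce step that folds the values for each key.

```python
from collections import defaultdict

# Hypothetical sample documents; in MongoDB these would live in a collection.
orders = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 15},
    {"customer": "alice", "amount": 20},
    {"customer": "carol", "amount": 5},
    {"customer": "bob", "amount": 10},
]


def map_order(doc):
    """Map step: emit (key, value) pairs for one document."""
    yield doc["customer"], doc["amount"]


def reduce_amounts(key, values):
    """Reduce step: fold all values emitted for one key into a single value."""
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    """Single-threaded map-reduce: group emitted values by key, then reduce."""
    emitted = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            emitted[key].append(value)
    # Stands in for the output collection: one result document per key.
    return [
        {"_id": key, "value": reduce_fn(key, values)}
        for key, values in sorted(emitted.items())
    ]


totals = map_reduce(orders, map_order, reduce_amounts)
print(totals)
```

Even on a single thread, the result here is one document per key rather than one bounded response, which is the property that matters when the aggregated output is large.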

Also, parallelism is coming, so it's good to have some practice :)

answered Oct 31 '22 19:10 by kristina