
MongoDB Map/Reduce - Reduce doesn't get called

I'm trying to do a simple map/reduce in the Mongo shell, but the reduce function never gets called. This is my code:

db.sellers.mapReduce( 
    function(){ emit( this._id, 'Map') } , 
    function(k,vs){ return 'Reduce' }, 
    { out: { inline: 1}})

And the result is

{
"results" : [
    {
        "_id" : ObjectId("4da0bdb56bd728c276911e1a"),
        "value" : "Map"
    },
    {
        "_id" : ObjectId("4da0df9a6bd728c276911e1b"),
        "value" : "Map"
    }
],
"timeMillis" : 0,
"counts" : {
    "input" : 2,
    "emit" : 2,
    "output" : 2
},
"ok" : 1,

}

What's wrong?

I'm using MongoDB 1.8.1 (32-bit) on Ubuntu 10.10.

Adil asked Apr 10 '11 12:04


3 Answers

The purpose of reduce is to, ahem, reduce the set of values associated with a given key into a single value (aggregating the results). If you emit only one value for each MapReduce key, there is no need for reduce; all the work is done. But if you emit two pairs for a given _id, reduce will be called:

emit(this._id, 'Map1');
emit(this._id, 'Map2');

This will call reduce with the following parameters:

reduce(_id, ['Map1', 'Map2'])

More likely, you will want to use _id as the MapReduce key when filtering a dataset: emit only when a given record fulfills some condition. But again, reduce won't be called in this case, which is expected.
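To make the grouping behaviour concrete, here is a toy model in plain JavaScript. It is only a sketch of the behaviour described above, not MongoDB's actual implementation: simulateMapReduce is a hypothetical helper, and emit is passed to map as an argument here, whereas in the Mongo shell it is a global.

```javascript
// Toy model of the map/reduce flow (NOT MongoDB's real code).
// simulateMapReduce is a hypothetical helper used only for illustration.
function simulateMapReduce(docs, map, reduce) {
  const groups = new Map();   // key -> array of emitted values
  let reduceCalls = 0;

  // Map phase: run map with `this` bound to each document and
  // collect every emitted value under its key.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) {
    map.call(doc, emit);
  }

  // Reduce phase: a key with a single value is passed through untouched;
  // reduce only runs when there are at least two values to combine.
  const results = [];
  for (const [key, values] of groups) {
    if (values.length === 1) {
      results.push({ _id: key, value: values[0] });
    } else {
      reduceCalls += 1;
      results.push({ _id: key, value: reduce(key, values) });
    }
  }
  return { results, reduceCalls };
}

const docs = [{ _id: 'a' }, { _id: 'b' }];
const reduceFn = function (key, values) { return 'Reduce'; };

// One emit per key: reduce is never invoked.
const once = simulateMapReduce(docs, function (emit) {
  emit(this._id, 'Map');
}, reduceFn);

// Two emits per key: reduce is invoked once for each key.
const twice = simulateMapReduce(docs, function (emit) {
  emit(this._id, 'Map1');
  emit(this._id, 'Map2');
}, reduceFn);

console.log(once.reduceCalls, twice.reduceCalls);
```

With one emit per _id the values come back as 'Map', matching the output in the question; with two emits per _id they come back as 'Reduce'.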

Tomasz Nurkiewicz answered Nov 02 '22 14:11


Well, MongoDB does not call the reduce function on a key if there is only one value for it.

In my opinion, this is bad. It should be left to my reducer code to decide whether to skip a single value or perform some operation on it.

Now, if I have to perform some operation on a single value, I end up writing a finalize function, and in it I try to tell apart the values that have gone through the reducer from those that have not.

I am quite sure it does not work this way in Hadoop.
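One way to soften the finalize workaround is to have map emit values in the same shape that reduce returns, so that finalize can treat reduced and unreduced values alike. The sketch below assumes a hypothetical per-key count; the names mapFn, reduceFn, finalizeFn and the count and label fields are illustrative and do not come from the question. The call at the end only runs inside a Mongo shell, where db is defined.

```javascript
// map emits the same shape that reduce returns, so a value looks the same
// whether or not it ever went through reduce.
var mapFn = function () {
  emit(this._id, { count: 1 });
};

// Combine partial counts into one value of the same shape.
var reduceFn = function (key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i].count;
  }
  return { count: total };
};

// finalize runs once per key on the final value, including values that
// skipped reduce because their key had only one emit.
var finalizeFn = function (key, value) {
  value.label = value.count === 1 ? 'single' : 'multiple';
  return value;
};

// Only executed inside a Mongo shell; skipped elsewhere.
if (typeof db !== 'undefined') {
  printjson(db.sellers.mapReduce(mapFn, reduceFn, {
    out: { inline: 1 },
    finalize: finalizeFn
  }));
}
```

Because every value carries its own count, finalize no longer needs to guess which path a value took.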

Sandeep Giri answered Nov 02 '22 13:11


Map reduce will collect values with a common key into a single value.

In this case nothing is to be done because each value emitted by map has a different key. No reduction is needed.

db.sellers.mapReduce( 
    function(){ emit( this._id, 'Map') } , 
    function(k,vs){ return 'Reduce' }, 
    { out: { inline: 1}})

This is not entirely clear from reading the documentation.

If you wanted to call reduce, you might hardcode an ID like this:

db.sellers.mapReduce( 
    function(){ emit( 1, 'Map') } , 
    function(k,vs){ return 'Reduce' }, 
    { out: { inline: 1}})

Now all the values emitted by map will be reduced until only one remains.
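"Reduced until only one remains" can mean that reduce runs several times for the same key, with earlier results fed back in as inputs, so the function should return the same kind of value it receives. The plain JavaScript sketch below illustrates this with a sum; reduceInChunks is a hypothetical helper and the chunking is an assumption made for illustration, not MongoDB's actual batching strategy.

```javascript
// Sum reducer: its output (a number) has the same shape as its inputs.
function sumReduce(key, values) {
  return values.reduce((acc, v) => acc + v, 0);
}

// Hypothetical helper: reduce the values in chunks, then reduce the
// partial results again, until a single value remains.
function reduceInChunks(key, values, chunkSize, reduce) {
  if (chunkSize < 2) throw new Error('chunkSize must be at least 2');
  let current = values;
  while (current.length > 1) {
    const partials = [];
    for (let i = 0; i < current.length; i += chunkSize) {
      const chunk = current.slice(i, i + chunkSize);
      // A lone leftover value is carried over without calling reduce.
      partials.push(chunk.length === 1 ? chunk[0] : reduce(key, chunk));
    }
    current = partials;
  }
  return current[0];
}

const values = [1, 2, 3, 4, 5];
const allAtOnce = sumReduce(1, values);
const inChunks = reduceInChunks(1, values, 2, sumReduce);
console.log(allAtOnce === inChunks);
```

A reducer such as function(k, vs){ return 'Reduce' } happens to survive this, since it ignores its inputs, but a real aggregation has to give the same answer however the values are grouped.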

superluminary answered Nov 02 '22 13:11