
MongoDB MapReduce: Global variables within map function instance?

I've written a MapReduce job in MongoDB and would like to use a global variable as a cache to write to and read from. I know it is not possible to share global variables across map function instances; I just want a global variable within each function instance. This kind of functionality exists in Hadoop's MapReduce, so I was expecting it to be available in MongoDB as well. But the following does not seem to work:

var cache = {}; // Does not seem to work!
function () {
  var hashValue = this.varValue1 + this.varValue2;
  if(typeof(cache[hashValue])!= 'undefined') {
    // Do nothing, we've processed at least one input record with this hash
  } else {
    // Process the input record
    // Cache the record
    cache[hashValue] = '1';
  }
}

Is this not allowed in MongoDB's MapReduce implementation, or am I doing something wrong in JavaScript (not experienced in JS)?

Lucas Zamboulis asked Jun 08 '10 09:06



2 Answers

Looking at the docs, I'm finding the following:

db.runCommand(
 { mapreduce : <collection>,
   map : <mapfunction>,
   reduce : <reducefunction>
   [, scope : <object where fields go into javascript global scope >]
 }
);

I think the "scope" option is what you need.

There's a test/example on GitHub that uses the "scope" option.

I'm still new to this stuff, but hopefully that's enough to get you started.

Gates VP answered Oct 02 '22 21:10



As Gates VP said, you need to add cache to the global scope. To give a complete answer based on your script, this is what you'll need to do:

db.runCommand(
 { mapreduce : <your collection>,
   map : <your map function, or reference to it>,
   reduce : <your reduce function, or reference to it>,
   scope : { cache : {} }
 }
);

The command injects the contents of the 'scope' object into the global context of your JavaScript functions. The caching will then work the way you are using it in your map function. I've tested this.
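To make that concrete, here is a minimal sketch of how the pieces could fit together. The collection name "records", the reduce function, and the emitted values are hypothetical placeholders, and the `out` field is an assumption for newer server versions, which require an output target. The map function mirrors the one in the question, except that it no longer declares `cache` itself and relies on `scope` to supply it.

```javascript
// Map function: `cache` is NOT declared here. It is expected to arrive
// through the `scope` option of the mapreduce command.
function mapFunction() {
  var hashValue = this.varValue1 + this.varValue2;
  if (typeof cache[hashValue] === "undefined") {
    // First record seen with this hash: remember it and process it.
    cache[hashValue] = 1;
    emit(hashValue, 1); // placeholder for the real processing
  }
  // Otherwise a record with this hash was already processed: do nothing.
}

// Hypothetical reduce function: sums the values emitted for a key.
function reduceFunction(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Command document in the shape shown above.
var command = {
  mapreduce: "records",    // hypothetical collection name
  map: mapFunction,
  reduce: reduceFunction,
  out: { inline: 1 },      // assumption: newer servers require an output target
  scope: { cache: {} }     // injected as the global `cache` seen by mapFunction
};

// In the mongo shell, the command would then be run as:
// db.runCommand(command);
```

Passing an empty object in `scope` only seeds the cache. Whether it is shared across every map call depends on how the server runs the job, so treat it as a best-effort cache rather than a guarantee of deduplication.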

Pawel Veselov answered Oct 02 '22 20:10
