Is there a way to use a user-defined function saved with db.system.js.save(...) in an aggregation pipeline or mapReduce?


Using stored JavaScript functions in the Aggregation pipeline, MapReduce or runCommand

1 Answer

Any function you save to system.js is available to the "JavaScript" processing statements, such as the $where operator and mapReduce, and can be referenced by the _id value it was assigned.

db.system.js.save({ 
   "_id": "squareThis", 
   "value": function(a) { return a*a } 
})

And with some data inserted into the "sample" collection:

{ "_id" : ObjectId("55aafd2bacbed38e06f9eccf"), "a" : 1 }
{ "_id" : ObjectId("55aafea6acbed38e06f9ecd0"), "a" : 2 }
{ "_id" : ObjectId("55aafeabacbed38e06f9ecd1"), "a" : 3 }

Then:

db.sample.mapReduce(
    function() {
       emit(null, squareThis(this.a));
    },
    function(key,values) {
        return Array.sum(values);
    },
    { "out": { "inline": 1 } }
 );

Gives:

   "results" : [
            {
                    "_id" : null,
                    "value" : 14
            }
    ],
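To see where the 14 comes from, the map, emit and reduce flow can be emulated in plain JavaScript without a server. This is only a rough sketch: `runMapReduce`, the in-memory `sample` array and the way `emit` is passed in are stand-ins made up for illustration, not how the server implements it.

```javascript
// Stand-in for the function stored in system.js under _id "squareThis".
function squareThis(a) { return a * a; }

// In-memory stand-in for the "sample" collection (ObjectIds omitted).
const sample = [{ a: 1 }, { a: 2 }, { a: 3 }];

// Hypothetical helper mimicking the flow: map each document, group the
// emitted values by key, then reduce each group.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  // The server exposes emit() as a global inside map; here it is an argument.
  const emit = (key, value) => {
    const k = JSON.stringify(key);
    if (!groups.has(k)) groups.set(k, { key, values: [] });
    groups.get(k).values.push(value);
  };
  // As in mapReduce, the document is bound to `this` inside map.
  for (const doc of docs) map.call(doc, emit);
  return [...groups.values()].map(g => ({
    _id: g.key,
    // reduce is skipped for keys that only have a single value
    value: g.values.length === 1 ? g.values[0] : reduce(g.key, g.values)
  }));
}

const results = runMapReduce(
  sample,
  function (emit) { emit(null, squareThis(this.a)); },
  // Array.sum is a shell helper, so a plain reduce is used here instead
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);

console.log(JSON.stringify(results)); // [{"_id":null,"value":14}]
```

Every document emits under the same null key, so the reducer sees [1, 4, 9] and returns 14.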

Or with $where:

db.sample.find(function() { return squareThis(this.a) == 9 })
{ "_id" : ObjectId("55aafeabacbed38e06f9ecd1"), "a" : 3 }
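Passing a bare function to find() in the shell is shorthand for a $where clause. The sketch below spells that query document out and applies it to an in-memory array, purely to show that the function is called once per document with the document bound to `this`. The `sample` array and the filter loop are assumptions standing in for the server.

```javascript
// Stand-in for the function stored in system.js under _id "squareThis".
function squareThis(a) { return a * a; }

// In-memory stand-in for the "sample" collection (ObjectIds omitted).
const sample = [{ a: 1 }, { a: 2 }, { a: 3 }];

// Explicit form of the query that the find(function) shorthand expresses.
const query = {
  $where: function () { return squareThis(this.a) == 9; }
};

// Toy matcher: invoke the $where function with each document as `this`
// and keep the documents for which it returns true.
const matched = sample.filter(doc => query.$where.call(doc));

console.log(JSON.stringify(matched)); // [{"a":3}]
```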

But in "neither" case can you use globals such as the database db reference or other functions. The documentation for both $where and mapReduce describes the limits on what you can do here. So if you thought you were going to do something like "look up data in another collection", then you can forget it because it is "Not Allowed".

Every MongoDB command is actually a call to "runCommand" "under the hood" anyway. But unless that command is actually "calling a JavaScript processing engine", stored functions are irrelevant to it. Only a few commands do this: mapReduce, group and eval, and of course find operations with $where.

The aggregation framework does not use JavaScript in any way at all. You might be misreading, as others have, a statement like this, which does not do what you think it does:

db.sample.aggregate([
    { "$match": {
        "a": { "$in": db.sample.distinct("a") }
    }}
])

So that is "not running inside" the aggregation pipeline; rather, the "result" of that .distinct() call is "evaluated" before the pipeline is sent to the server. It is much the same as what happens with an external variable:

var items = [1,2,3];
db.sample.aggregate([
    { "$match": {
        "a": { "$in": items }
    }}
])

Both are essentially sent to the server in the same form:

db.sample.aggregate([
    { "$match": {
        "a": { "$in": [1,2,3] }
    }}
])
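The point about evaluation order can be checked in plain JavaScript. In this sketch `distinctA` is a made-up stand-in for db.sample.distinct("a"), and JSON.stringify stands in for serialisation to the wire format (which is really BSON), so treat it as an illustration only.

```javascript
// Hypothetical stand-in for db.sample.distinct("a"). In the shell this would
// be a separate round trip to the server that returns a plain array.
function distinctA() { return [1, 2, 3]; }

const items = [1, 2, 3];

// Arguments are evaluated before the pipeline array is even constructed,
// so all three pipelines hold nothing but plain data.
const viaCall     = [{ $match: { a: { $in: distinctA() } } }];
const viaVariable = [{ $match: { a: { $in: items } } }];
const literal     = [{ $match: { a: { $in: [1, 2, 3] } } }];

const sent = [viaCall, viaVariable, literal].map(p => JSON.stringify(p));

console.log(sent[0]); // [{"$match":{"a":{"$in":[1,2,3]}}}]
console.log(sent[0] === sent[1] && sent[1] === sent[2]); // true
```

By the time anything is serialised, no trace of the function call or the variable is left in the pipeline.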

So it is "not possible" to "call" any JavaScript function in the aggregation pipeline, nor is there really any point in "passing in" results in general from something saved in system.js. The "code" needs to be "loaded to the client" and only a JavaScript engine can actually do anything with it.

With the aggregation framework, all of the "operators" available are actually natively coded functions as opposed to the "free form" JavaScript interpretation provided for mapReduce. So instead of writing "JavaScript", you use the operators themselves:

db.sample.aggregate([
    { "$group": {
        "_id": null,
        "squared": { "$sum": {
           "$multiply": [ "$a", "$a" ]
        }}
    }}
])

{ "_id" : null, "squared" : 14 }
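To make the "operators are data, not code" idea concrete, here is a toy evaluator in plain JavaScript. It handles only "$field" paths, $multiply, and a $group with a null _id and a single $sum. The names `evalExpr` and `groupSum` are invented for this sketch and have nothing to do with the server's native implementation.

```javascript
// In-memory stand-in for the "sample" collection (ObjectIds omitted).
const sample = [{ a: 1 }, { a: 2 }, { a: 3 }];

// Toy evaluator for two expression forms only: "$field" paths and $multiply.
function evalExpr(expr, doc) {
  if (typeof expr === "string" && expr.startsWith("$")) {
    return doc[expr.slice(1)]; // top-level field lookup only
  }
  if (expr !== null && typeof expr === "object" && "$multiply" in expr) {
    return expr.$multiply.reduce((product, e) => product * evalExpr(e, doc), 1);
  }
  return expr; // anything else is treated as a literal value
}

// Toy $group with _id: null and one $sum accumulator over all documents.
function groupSum(docs, field, sumExpr) {
  const total = docs.reduce((sum, doc) => sum + evalExpr(sumExpr, doc), 0);
  return { _id: null, [field]: total };
}

// The expression is a plain object describing the work, not a function.
const grouped = groupSum(sample, "squared", { $multiply: ["$a", "$a"] });

console.log(JSON.stringify(grouped)); // {"_id":null,"squared":14}
```

The pipeline stage is just a description that a built-in routine interprets, which is why there is nowhere in it for a stored JavaScript function to run.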

So there are limitations on what you can do with functions saved in system.js, and the chances are that what you want to do is one of the following:

Not allowed, such as accessing data from another collection
Not really required, as the logic is generally self-contained anyway
Probably better implemented in client logic or in some other form anyway

Just about the only practical use I can really think of is when you have a number of "mapReduce" operations that cannot be done any other way, and you have various "shared" functions that you would rather store on the server than maintain within every mapReduce function call.

But then again, 90% of the time the reason for choosing mapReduce over the aggregation framework is that the "document structure" of the collections has been poorly chosen and the JavaScript functionality is "required" to traverse the document for search and analysis.

So you can use it within the allowed constraints, but in most cases you probably should not be using it at all, and should instead fix the other issues that led you to believe you needed this feature in the first place.


answered Nov 08 '22 19:11

Blakes Seven


Tags:

mongodb

mongodb-query

aggregation-framework

Prakash Thapa
