I am trying to compute an average value from a collection using the MongoDB Java driver, like this:
DBObject condition =
    new BasicDBObject("pluginIdentifier", plugin.getIdentifier());
DBObject initial = new BasicDBObject();
initial.put("count", 0);
initial.put("totalDuration", 0);
String reduce = "function(duration, out) { out.count++; "
    + "out.totalDuration+=duration.floatApprox; }";
String finalize = "function(out) { out.avg = out.totalDuration.floatApprox / "
    + "out.count; }";
DBObject avg = durationEntries.group(
    new BasicDBObject("pluginIdentifier", true),
    condition, initial, reduce, finalize);
System.out.println(avg);
"duration" is a NumberLong (in Java it is a Long; presumably the Java driver converts it). After some searching I found that using .floatApprox is one way to extract the number, and this also works in the MongoDB console:
> db.DurationEntries.findOne().duration.floatApprox
5
However, running the Java code above does not compute an average; it returns this instead:
[{"pluginIdentifier":"dummy", "count":7.0, "totalDuration":NaN, "avg":NaN}]
I tried several variations, with and without .floatApprox, but so far I have only obtained some weird string concatenations.
My question is: what am I doing wrong, and how should I calculate the average of a single NumberLong column?
If you're having problems with map/reduce, you should probably drop down into the MongoDB console, work it out there, and then translate that into your driver.
Take, for example, the following documents:
db.tasks.find()
{ "_id" : ObjectId("4dd51c0a3f42cc01ab0e6506"), "duration" : 10, "name" : "StartProcess", "date" : "20110501" }
{ "_id" : ObjectId("4dd51c0e3f42cc01ab0e6507"), "duration" : 11, "name" : "StartProcess", "date" : "20110502" }
{ "_id" : ObjectId("4dd51c113f42cc01ab0e6508"), "duration" : 12, "name" : "StartProcess", "date" : "20110503" }
You would write the mapReduce to calculate the average duration of StartProcess as follows:
m = function () {
    emit(this.name, { totalDuration : this.duration, num : 1 });
};
r = function (name, values) {
    var n = { totalDuration : 0, num : 0 };
    for (var i = 0; i < values.length; i++) {
        n.totalDuration += values[i].totalDuration;
        n.num += values[i].num;
    }
    return n;
};
f = function (who, res) {
    res.avg = res.totalDuration / res.num;
    return res;
};
Then, assuming you're using MongoDB 1.7 or above:
db.tasks.mapReduce( m, r, { finalize : f, out : {inline : 1} });
This would give you the following result:
"results" : [
    {
        "_id" : "StartProcess",
        "value" : {
            "totalDuration" : 33,
            "num" : 3,
            "avg" : 11
        }
    }
]
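To check the arithmetic without a running server, the same three functions can be exercised in plain JavaScript. This is only a rough sketch: runMapReduce is a hypothetical stand-in for the server's mapReduce, and emit is passed to the map function as an argument here, whereas in the mongo shell it is a global.

```javascript
// Plain-JavaScript emulation of the map/reduce/finalize flow (no MongoDB needed).
// The documents mirror the three sample tasks above, minus _id.
const tasks = [
  { duration: 10, name: "StartProcess", date: "20110501" },
  { duration: 11, name: "StartProcess", date: "20110502" },
  { duration: 12, name: "StartProcess", date: "20110503" },
];

// Same logic as m, r and f above; emit is a parameter in this sketch.
const m = function (emit) {
  emit(this.name, { totalDuration: this.duration, num: 1 });
};
const r = function (name, values) {
  const n = { totalDuration: 0, num: 0 };
  for (let i = 0; i < values.length; i++) {
    n.totalDuration += values[i].totalDuration;
    n.num += values[i].num;
  }
  return n;
};
const f = function (who, res) {
  res.avg = res.totalDuration / res.num;
  return res;
};

// Hypothetical helper: groups emitted values by key, reduces each group once,
// then finalizes. A real server may call reduce several times per key.
function runMapReduce(docs, map, reduce, finalize) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map.call(doc, emit);
  const out = [];
  for (const [key, values] of groups) {
    out.push({ _id: key, value: finalize(key, reduce(key, values)) });
  }
  return out;
}

const results = runMapReduce(tasks, m, r, f);
console.log(JSON.stringify(results));
```

Because the reduce function returns the same shape it receives (totalDuration and num), it stays correct even if it is applied to partial results, which is why the average is only computed in the finalize step.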
If this doesn't help, can you post your map function and document structure?
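As for the NaN in the group() call from the question, one likely cause can be sketched in plain JavaScript. The sketch assumes that group() passes the whole current document as the first argument of the reduce function, so duration.floatApprox in the question is really document.floatApprox, which is undefined. The entries array, the aggregate helper, and the modelling of NumberLong as an object with a floatApprox property are all hypothetical stand-ins, not driver code.

```javascript
// Hypothetical stand-in for DurationEntries; NumberLong is modelled as a
// plain object exposing floatApprox, as seen in the console session above.
const entries = [
  { pluginIdentifier: "dummy", duration: { floatApprox: 5 } },
  { pluginIdentifier: "dummy", duration: { floatApprox: 7 } },
];

// As in the question: the first argument is treated as the duration itself.
function brokenReduce(doc, out) {
  out.count++;
  out.totalDuration += doc.floatApprox; // undefined, so the sum becomes NaN
}

// Sketch of a fix: read the duration field from the document first.
function fixedReduce(doc, out) {
  out.count++;
  out.totalDuration += doc.duration.floatApprox;
}

// totalDuration is already a plain number here, so no floatApprox is needed.
function finalize(out) {
  out.avg = out.totalDuration / out.count;
}

// Hypothetical helper that mimics group() for a single key.
function aggregate(docs, reduce) {
  const out = { count: 0, totalDuration: 0 };
  for (const doc of docs) reduce(doc, out);
  finalize(out);
  return out;
}

const broken = aggregate(entries, brokenReduce);
const fixed = aggregate(entries, fixedReduce);
console.log(broken); // count is 2; totalDuration and avg are NaN
console.log(fixed);  // { count: 2, totalDuration: 12, avg: 6 }
```

Under the same assumption, out.totalDuration.floatApprox in the finalize function would also be undefined, since the accumulated total is an ordinary number rather than a NumberLong.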