The data type of the field is String. I would like to find the lengths of the longest and shortest values of a field in MongoDB.
I have 500,000 documents in total in my collection.
As for the logical condition, there are String Aggregation Operators you can use, such as the $strLenCP operator, to check the length of the string. If the length is $gt a specified value, then this is a true match and the document is "kept". Otherwise it is "pruned" and discarded.
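The keep-or-prune logic described above could be sketched as a $redact stage. This is only a minimal sketch: redact_longer_than is a hypothetical helper, and the field name "name" and the threshold 10 are assumptions chosen for illustration.

```ruby
# Hypothetical helper (not part of any driver): builds a $redact stage that
# keeps documents whose string field is longer than `limit` code points and
# prunes all others.
def redact_longer_than(field, limit)
  {
    "$redact" => {
      "$cond" => [
        { "$gt" => [{ "$strLenCP" => "$" + field }, limit] },
        '$$KEEP',  # condition true: the document is kept
        '$$PRUNE'  # condition false: the document is discarded
      ]
    }
  }
end

# Assumed field name and threshold, for illustration only.
stage = redact_longer_than("name", 10)
```

The resulting hash could then be passed as one stage of an aggregation pipeline, for example Class.collection.aggregate([stage]).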
The length value displays the length of the string in the “name” column. For example: The length of the string 'Dallas Mavs' is 11.
MongoDB connector limits text field to 255 characters.
$expr can build query expressions that compare fields from the same document in a $match stage. If the $match stage is part of a $lookup stage, $expr can compare fields using let variables. See Perform Multiple Joins and a Correlated Subquery with $lookup for an example.
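A length filter written with $expr inside a $match stage might look like the sketch below. The helper match_longer_than, the field name "name", and the threshold 10 are hypothetical and only serve as an illustration.

```ruby
# Hypothetical helper: builds a $match stage that uses $expr so the
# aggregation operator $strLenCP can be evaluated against each document.
def match_longer_than(field, limit)
  {
    "$match" => {
      "$expr" => {
        "$gt" => [{ "$strLenCP" => "$" + field }, limit]
      }
    }
  }
end

# Assumed field name and threshold, for illustration only.
stage = match_longer_than("name", 10)
```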
In modern releases MongoDB has the $strLenBytes or $strLenCP aggregation operators that allow you to simply do:
Class.collection.aggregate([
  { "$group" => {
    "_id" => nil,
    "max" => { "$max" => { "$strLenCP" => "$a" } },
    "min" => { "$min" => { "$strLenCP" => "$a" } }
  }}
])
Where "a" is the string property in your document you want to get the min and max length from.
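One caveat: $strLenCP and $strLenBytes raise an error when the field is null or missing, so it can help to filter on the field type first. The following is a minimal sketch of that idea, not the original author's code. length_range_pipeline is a hypothetical helper, and the sample string is only there to show how code points and bytes differ.

```ruby
# Hypothetical helper: builds a min/max length pipeline that first keeps only
# documents where the field is a string, so the length operator never sees
# null or a missing field.
def length_range_pipeline(field, operator: "$strLenCP")
  length = { operator => "$" + field }
  [
    { "$match" => { field => { "$type" => "string" } } },
    { "$group" => {
      "_id" => nil,
      "max" => { "$max" => length },
      "min" => { "$min" => length }
    } }
  ]
end

pipeline = length_range_pipeline("a")

# Ruby analogue of the two operators on a string with one multi-byte character
sample      = "caf\u00e9"
code_points = sample.length    # counts characters, like $strLenCP
bytes       = sample.bytesize  # counts UTF-8 bytes, like $strLenBytes
```

With Mongoid the pipeline could be run as Class.collection.aggregate(pipeline). Passing operator: "$strLenBytes" switches the measurement from code points to bytes.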
To output the minimum and maximum length on older releases that lack these operators, one approach is to use mapReduce with a few tricks to just keep the values.
First, define a mapper function that emits only a single item for the whole collection, to reduce the load:
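map = %Q{
  function () {
    // store[0] holds the running minimum and store[1] the running maximum;
    // both start out undefined, so seed them from the first document
    if ( store[0] === undefined || this.a.length < store[0] )
      store[0] = this.a.length;
    if ( store[1] === undefined || this.a.length > store[1] )
      store[1] = this.a.length;
    // emit a single item only, so that finalize runs once
    if ( count == 0 )
      emit( null, 0 );
    count++;
  }
}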
Since this mostly works with a globally scoped variable that keeps the min and max lengths, you just substitute those values in a finalize function on the single document emitted. There is no reduce stage, but you still define a "blank" function for it even though it is never called:
reduce = %Q{ function() {} }
finalize = %Q{
  function(key,value) {
    // replace the emitted placeholder with the values kept in scope
    return {
      min: store[0],
      max: store[1]
    };
  }
}
Then call the mapReduce operation:
Class.map_reduce(map,reduce).out(inline: 1).finalize(finalize).scope(store: [], count: 0)
So all the work is done on the server and not by iterating results sent to the client application. On a small set like this:
{ "_id" : ObjectId("543e8ee7ddd272814f919472"), "a" : "this" }
{ "_id" : ObjectId("543e8eedddd272814f919473"), "a" : "something" }
{ "_id" : ObjectId("543e8ef6ddd272814f919474"), "a" : "other" }
You get a result like this (shell output, but much the same for the driver):
{
  "results" : [
    {
      "_id" : null,
      "value" : {
        "min" : 4,
        "max" : 9
      }
    }
  ],
  "timeMillis" : 1,
  "counts" : {
    "input" : 3,
    "emit" : 1,
    "reduce" : 0,
    "output" : 1
  },
  "ok" : 1
}
So mapReduce allows the JavaScript processing on the server to do this fairly quickly, reducing your network traffic. On releases without the $strLenCP and $strLenBytes operators there is no other native way for MongoDB to return a string length, so the JavaScript processing is necessary on the server.
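As a rough cross-check of the sample result, the same minimum and maximum can be computed in plain Ruby over the three sample documents. This is only a client-side sketch for verifying small samples, not a replacement for the server-side operation.

```ruby
# The three sample documents from above, reduced to the field of interest.
docs = [
  { "a" => "this" },
  { "a" => "something" },
  { "a" => "other" }
]

# Enumerable#minmax returns the smallest and largest length in one pass.
min, max = docs.map { |doc| doc["a"].length }.minmax

puts "min=#{min} max=#{max}"  # prints "min=4 max=9"
```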