How to get the most common distinct values of a key in query results.
Consider a collection 'collectionSample':
{
name : 'a',
value: 10,
word : 'baz'
},
{
name : 'a',
value: 65,
word : 'bar'
},
{
name : 'a',
value: 3,
word : 'foo'
},
{
name : 'b',
value: 110,
word : 'bar'
},
{
name : 'b',
value: 256,
word : 'baz'
}
Here I want to find the mode of the key 'name', that is, the most frequently repeated 'name' value.
The result I'm hoping to get looks like:
{'most_common_distinct_val': 'a'} // since 'a' occurs 3 times and 'b' occurs 2 times
How can I query this with the Node.js MongoDB client?
To count the unique values, use "distinct()" rather than "find()", and the "length" of the returned array rather than "count()". The first argument to "distinct()" is the field whose distinct values are collected; the second is the query that selects which documents to consider.
distinct() considers each element of an array field as a separate value. For instance, if a field has as its value [ 1, [1], 1 ], then db.collection.distinct() considers 1, [1], and 1 as separate values.
You can use $addToSet with the aggregation framework to count distinct objects. This is not a generic solution: if you have a large number of unique zip codes per result, the array would be very large. The question was to get the city with the MOST zip codes for each state, not to get the actual zip codes.
To create a unique index, use the db.collection.createIndex() method with the unique option set to true.
As of MongoDB 3.4, the $group and $sort pipeline shown further down can be simplified by using $sortByCount, which is essentially equivalent to $group + $sort. Code snippet:
col.aggregate([{
"$sortByCount": "$name"
}], ...);
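To get exactly the result shape asked for in the question, the $sortByCount stage can be followed by $limit and $project. The sketch below is one possible way to do it, not the original answer's code: `buildModePipeline` and `findMostCommon` are hypothetical helper names, and `col` is assumed to be a driver collection whose `aggregate()` returns a cursor with a promise-returning `toArray()`.

```javascript
// Hypothetical helper: builds a pipeline that returns only the most common
// value of `field`, shaped like the result asked for in the question.
function buildModePipeline(field) {
  return [
    { $sortByCount: "$" + field }, // group by the field, count, sort by count descending
    { $limit: 1 },                 // keep only the most frequent value
    { $project: { _id: 0, most_common_distinct_val: "$_id" } } // rename _id, drop count
  ];
}

// Assumption: `col` is a Collection from the Node.js driver, connected to a
// MongoDB 3.4+ server (required for $sortByCount).
async function findMostCommon(col, field) {
  const docs = await col.aggregate(buildModePipeline(field)).toArray();
  return docs.length > 0 ? docs[0] : null; // null when the collection is empty
}
```

For the sample collection, `findMostCommon(col, "name")` should resolve to a document of the form `{ most_common_distinct_val: 'a' }`. If two values tie for the highest count, which one is returned is not defined.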
The MongoDB aggregation framework will do the job. Code sample:
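var MongoClient = require("mongodb").MongoClient;
// Assumes the callback-style API of older driver versions, where connect() passes the db handle.
MongoClient.connect("mongodb://localhost/YourDB", function(err, db) {
  if (err) throw err; // connection failed
  var col = db.collection("YourCol");
  col.aggregate([{
    "$group": {_id: "$name", count: { "$sum": 1}} // one result per name, with its count
  }, {
    "$sort": {count: -1} // highest count first
  }], function(err, docs) {
    if (err) throw err; // aggregation failed
    docs.forEach(function(doc) {
      console.log(JSON.stringify(doc)); // do what you want here.
    });
    db.close(); // release the connection so the process can exit
  });
});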
The aggregation framework passes the documents through a sequence of stages (called "filters" here), each of which transforms the result set. As you can see in the sample, the stages are given as an array.
Here I have 2 filters, the first one:
{"$group": {_id: "$name", count: { "$sum": 1}}}
groups your data by name and counts how many times each name occurs.
The 2nd one:
{"$sort": {count: -1}}
sorts the results by that count, in descending order.
If you only want the single most repeated record, you can add one more filter:
{"$limit": 1}
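To make the effect of the three filters concrete, here is a plain JavaScript sketch that mimics $group, $sort and $limit in memory on the sample documents from the question. It needs no database and only illustrates what the pipeline computes; `groupSortLimit` is a hypothetical name, not part of the MongoDB driver.

```javascript
// Sample documents from the question.
const sample = [
  { name: "a", value: 10, word: "baz" },
  { name: "a", value: 65, word: "bar" },
  { name: "a", value: 3, word: "foo" },
  { name: "b", value: 110, word: "bar" },
  { name: "b", value: 256, word: "baz" }
];

// Hypothetical in-memory equivalent of $group + $sort + $limit.
function groupSortLimit(docs, key, limit) {
  const counts = new Map();
  for (const doc of docs) {
    // $group with { $sum: 1 }: count the documents per distinct key value
    counts.set(doc[key], (counts.get(doc[key]) || 0) + 1);
  }
  return Array.from(counts, ([value, count]) => ({ _id: value, count }))
    .sort((x, y) => y.count - x.count) // $sort: { count: -1 }
    .slice(0, limit);                  // $limit
}

console.log(groupSortLimit(sample, "name", 1)); // [ { _id: 'a', count: 3 } ]
```

On a real collection the server does this work, so the full set of documents never has to be loaded into the Node.js process.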
You can do a lot more with the framework; refer to the documentation on aggregation operators.