Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most common distinct values mongodb

How to get the most common distinct values of a key in query results.

Consider a collection 'collectionSample'

{
    name : 'a',
    value: 10,
    word : 'baz'
},
{
    name : 'a',
    value: 65,
    word : 'bar'
},
{
    name : 'a',
    value: 3,
    word : 'foo'
},
{
    name : 'b',
    value: 110,
    word : 'bar'
},
{
    name : 'b',
    value: 256,
    word : 'baz'
}

Here I want to find the mode of key 'name', that is the most repeated distinct 'name'.

The result I'm hoping to get is like

 {'most_common_distinct_val':a}  //since a is count 3 and b is count 2

How to query it in NodeJs mongo client?

like image 464
Okky Avatar asked May 01 '14 05:05

Okky


People also ask

How can I count distinct values in MongoDB?

To count the unique values, use "distinct()" rather than "find()", and "length" rather than "count()". The first argument for "distinct" is the field for which to aggregate distinct values, the second is the conditional statement that specifies which rows to select.

What is the use of distinct in MongoDB?

distinct() considers each element of the array as a separate value. For instance, if a field has as its value [ 1, [1], 1 ] , then db. collection. distinct() considers 1 , [1] , and 1 as separate values.

How do I use distinct aggregation in MongoDB?

You can use $addToSet with the aggregation framework to count distinct objects. Not a generic solution, if you have a large number of unique zip codes per result, this array would be very large. The question was to get the city with MOST zip codes for each state, not to get the actual zip codes.

How do I set unique fields in MongoDB?

To create a unique index, use the db. collection. createIndex() method with the unique option set to true .


1 Answers

2017-08-01 Update

As release of MongoDB 3.4, the following code can be simplified by using $sortByCount, which essentially equals to $group + $sort. Code snippet:

col.aggregate([{
    "$sortByCount": "$name"
}], ...);

The mongodb aggregation framework would do the job. Code sample:

var MongoClient = require("mongodb").MongoClient;
MongoClient.connect("mongodb://localhost/YourDB", function(err, db) {
    var col = db.collection("YourCol");
    col.aggregate([{
        "$group": {_id: "$name", count: { "$sum": 1}}
    }, {
        "$sort": {count: -1}
    }], function(err, docs) {
        var keys = []
        docs.forEach(function(doc) {
            console.log(JSON.stringify(doc)); // do what you want here.
        });
    });
});

The aggregation framework uses different "filters" to filter out the result set. As you can see in the sample, there's an array of all these filters.
Here I have 2 filters, the first one:

{"$group": {_id: "$name", count: { "$sum": 1}}}

is to group your data by name and count the repeated times.
The 2nd one:

{"$sort": {count: -1}}

is to sort the result by repeated times (count).
if you want only the max repeated one record, you can add a filter there:

{"$limit": 1}

You can do a lot more things with the framework. refer to the doc about operators

like image 90
yaoxing Avatar answered Nov 09 '22 05:11

yaoxing