Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count fields in a MongoDB Collection

I have a collection of documents like this one:

{
    "_id" : ObjectId("..."),
    "field1": "some string",
    "field2": "another string",
    "field3": 123
}

I'd like to be able to iterate over the entire collection, and find the entire number of fields there are. In this example document there are 3 (I don't want to include _id), but it ranges from 2 to 50 fields in a document. Ultimately, I'm just looking for the average number of fields per document.

Any ideas?

like image 489
adamb0mb Avatar asked Dec 10 '12 23:12

adamb0mb


People also ask

Is count faster than find MongoDB?

find({}). count() more fast then collection.

Can we use count with aggregate function in MongoDB?

MongoDB aggregate $count element in array For this, MongoDB provides the $size aggregation to count and returns the total number of items in an array. Let's get understand this with the help of an example. Example: The subsequent documents were inserted into the Test collection.

How do you count fields in MongoDB?

First stage $project is to turn all keys into array to count fields. Second stage $group is to sum the number of keys/fields in the collection, also the number of documents processed. Third stage $project is subtracting the total number of fields with the total number of documents (As you don't want to count for _id ).


2 Answers

Iterate over the entire collection, and find the entire number of fields there are

Now you can utilise aggregation operator $objectToArray (SERVER-23310) to turn keys into values and count them. This operator is available in MongoDB v3.4.4+

For example:

db.collection.aggregate([
         {"$project":{"numFields":{"$size":{"$objectToArray":"$$ROOT"}}}}, 
         {"$group":{"_id":null, "fields":{"$sum":"$numFields"}, "docs":{"$sum":1}}}, 
         {"$project":{"total":{"$subtract":["$fields", "$docs"]}, _id:0}}
])

First stage $project is to turn all keys into array to count fields. Second stage $group is to sum the number of keys/fields in the collection, also the number of documents processed. Third stage $project is subtracting the total number of fields with the total number of documents (As you don't want to count for _id ).

You can easily add $avg to count for average on the last stage.

like image 192
Wan Bachtiar Avatar answered Nov 12 '22 03:11

Wan Bachtiar


PRIMARY> var count = 0;
PRIMARY> db.my_table.find().forEach( function(d) { for(f in d) { count++; } });
PRIMARY> count
1074942

This is the most simple way I could figure out how to do this. On really large datasets, it probably makes sense to go the Map-Reduce path. But, while your set is small enough, this'll do.

This is O(n^2), but I'm not sure there is a better way.

like image 41
adamb0mb Avatar answered Nov 12 '22 03:11

adamb0mb