Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB Aggregate Framework - Group by Year

I've been trying to use the aggregate function to group date fields by year:

db.identities.aggregate([
{
    $group : {
        _id : { year : {$year : "$birth_date"}},
        total : {$sum : 1}
        }
    }   
])

Some of my dates however fall before 1970 and being a Windows user I get a nasty error about gmtime:

{
    "errmsg" : "exception: gmtime failed - your system doesn't support dates before 1970",
    "code" : 16422,
    "ok" : 0
}

I know the obvious answer now is for me to get a virtual machine running or something but I was just curious if there were any work-arounds for windows (Windows 7 in my case). Failing that how much of a performance hit would storing the date as a nested object be i.e:

birth_date : {
  year : 1980,
  month : 12,
  day : 9
}

I'm not too sure how hectic that would be with indexes etc.

Any advice appreciated!

like image 700
backdesk Avatar asked Dec 16 '12 17:12

backdesk


People also ask

How do I sort data in MongoDB aggregation?

In MongoDB, the $sort stage is used to sort all the documents in the aggregation pipeline and pass a sorted order to the next stage of the pipeline. Lets take a closer look at the above syntax: The $sort stage accepts a document that defines the field or fields that will be used for sorting.

WHAT IS group in MongoDB aggregation?

Group operator (also known as an accumulator operator) is a crucial operator in the MongoDB language, as it helps to perform various transformations of data. It is a part of aggregation in MongoDB. MongoDB is an open-source NoSQL database management program. NoSQL is an alternative to traditional relational databases.

Is MongoDB aggregation slow?

mongodb - Mongo performance is extremely slow for an aggregation query - Stack Overflow. Stack Overflow for Teams – Start collaborating and sharing organizational knowledge.

Is aggregation fast in MongoDB?

On large collections of millions of documents, MongoDB's aggregation was shown to be much worse than Elasticsearch. Performance worsens with collection size when MongoDB starts using the disk due to limited system RAM. The $lookup stage used without indexes can be very slow.


1 Answers

Some versions of Windows have been known to work. By any chance, are you using a 32-bit OS? The code in question is here, and depends upon the gmtime_s() implementation.

If this collection is simply for aggregation queries, you can certainly get by with storing date components in an object. I'd suggest abbreviating the field names (e.g. y, m, d) to save on storage, since the field strings are present in each stored document. The trade-off here is that none of the aggregation date operators can be used. You may want to store the timestamp as a signed integer (e.g. ts) so that you can easily do range queries if necessary.

like image 165
jmikola Avatar answered Nov 01 '22 07:11

jmikola