
How can I write an aggregation without exceeding the maximum document size?

I got an "exceeds maximum document size" exception from the following query:

import datetime

pipe = [
    {"$match": {"birthday": {"$gte": datetime.datetime(1987, 1, 1, 0, 0)}}}
]
res = db.patients.aggregate(pipe, allowDiskUse=True)

I fixed it by adding the $project operator (pipeline below).

However, what if the result is still over 16MB even when I use $project?

What can I do? Any ideas? Thank you.

pipe = [
    {"$project": {"birthday": 1, "id": 1}},
    {"$match": {"birthday": {"$gte": datetime.datetime(1987, 1, 1, 0, 0)}}},
]
res = db.patients.aggregate(pipe, allowDiskUse=True)

Exception

OperationFailure: command SON([('aggregate', 'patients'), ('pipeline', [{'$match': {'birthday': {'$gte': datetime.datetime(1987, 1, 1, 0, 0)}}}]), ('allowDiskUse', True)]) on namespace tw_insurance_security_development.$cmd failed: exception: aggregation result exceeds maximum document size (16MB)
newBike asked Apr 15 '15 07:04


2 Answers

By default, the result of an aggregation is returned to you in a single BSON document, which is where the 16MB size restriction comes from. If you need to return more than that, you can either:

  • have the results be output to a collection. You do this by finishing your pipeline with

    {"$out": "some-collection-name"}

    You then query that collection as normal (you'll need to drop it yourself when you're done with it).

  • have the results returned as a cursor, by specifying useCursor=True when you call aggregate.

Both of these options require MongoDB 2.6. If you are still running MongoDB 2.4, then this is a fundamental limit of aggregations.
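
As a rough illustration of both options in Python, here is a minimal sketch. It assumes PyMongo 3.0 or later, where aggregate() returns a cursor by default. The helper names (build_pipeline, iter_patients, export_patients) are made up for this example; only the patients collection and the birthday and id fields come from the question.

```python
import datetime


def build_pipeline(since, out_collection=None):
    """Build the question's pipeline: filter by birthday, keep two fields.

    If out_collection is given, append a $out stage so the server writes
    the results to that collection instead of returning them.
    """
    pipeline = [
        # Filter first so fewer documents flow through later stages.
        {"$match": {"birthday": {"$gte": since}}},
        {"$project": {"birthday": 1, "id": 1}},
    ]
    if out_collection is not None:
        # $out must be the last stage of the pipeline.
        pipeline.append({"$out": out_collection})
    return pipeline


def iter_patients(collection, since):
    """Cursor option: stream results batch by batch."""
    # Assumption: PyMongo 3.0+, where aggregate() returns a cursor, so
    # the results are not packed into a single 16MB document.
    cursor = collection.aggregate(build_pipeline(since), allowDiskUse=True)
    for doc in cursor:
        yield doc


def export_patients(collection, since, out_collection):
    """$out option: write results to another collection."""
    pipeline = build_pipeline(since, out_collection)
    # With $out the results are written to out_collection on the server.
    collection.aggregate(pipeline, allowDiskUse=True)
    return collection.database[out_collection]
```

With a live connection, this could be used as `for doc in iter_patients(db.patients, datetime.datetime(1987, 1, 1)): ...`. Each individual document is still subject to the 16MB limit, and a collection created through $out stays around until you drop it.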

Frederick Cheung answered Oct 06 '22 06:10


As @Frederick said, this requires MongoDB 2.6 at least. For further reference, see the MongoDB documentation for db.collection.aggregate, which works similarly to the runCommand approach. Note that the "cursor" option addresses the document size limit, while the "allowDiskUse" option addresses the sort memory limit.
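
To make the two options concrete, here is a sketch of the aggregate command document that the runCommand form would send, with both options available. The helper name aggregate_command and its parameters are illustrative assumptions, not part of any library.

```python
from collections import OrderedDict


def aggregate_command(collection_name, pipeline, batch_size=None,
                      allow_disk_use=False):
    """Build an `aggregate` command document for the runCommand form."""
    # The command name must be the first key, hence the ordered mapping.
    cmd = OrderedDict([
        ("aggregate", collection_name),
        ("pipeline", list(pipeline)),
    ])
    # "cursor" asks the server to return results in batches rather than
    # as one document, which avoids the 16MB result limit.
    cmd["cursor"] = {} if batch_size is None else {"batchSize": batch_size}
    if allow_disk_use:
        # "allowDiskUse" lets stages such as $sort spill to temporary
        # files when they exceed their in-memory limit.
        cmd["allowDiskUse"] = True
    return cmd
```

With PyMongo, a document like this could be passed to db.command(), although collection.aggregate() is usually more convenient because it handles fetching the later batches for you.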

Ahmet Yeşil answered Oct 06 '22 05:10