I'm trying to store a list of ObjectIds in a document as an array field.
I understand MongoDB has a 4MB size limit for single documents. So considering that an ObjectId is 12 bytes long, a document should be able to hold more than 300,000 entries in one array field. (Let me know if the calculation is off.)
If the number of entries in the array gets close to that limit, what kind of performance can I expect? Especially when the field is indexed? Any memory issues?
Typical queries would look like this:
Query by a single value
    db.myCollection.find(
        {
            myObjectIds: ObjectId('47cc67093475061e3d95369d')
        }
    );
Query by multiple values
    db.myCollection.find(
        {
            myObjectIds: {$in: [ObjectId('47cc67093475061e3d95369d'), ...]}
        }
    );
Add a new value to multiple documents
    db.myCollection.update(
        {
            _id: {$in: [ObjectId('56cc67093475061e3d95369d'), ...]}
        },
        {
            $addToSet: {myObjectIds: ObjectId('69cc67093475061e3d95369d')}
        },
        // Without multi: true, update() modifies only the first matching document.
        {multi: true}
    );
MongoDB is often used for real-time analysis of large data sets. For instance, geospatial indexing enables analysis of GPS data in real time.
The 16 MB limit applies to each document. When you call insertMany, the array passed as an argument holds multiple documents: each array element is a separate document, so the limit applies to each element individually.
MongoDB also allows indexing array elements, in this case fields of the comment objects in the comments array. For example, if you query comments by "comments.user" and need fast access, you can create an index on that field. Indexes on array fields are called multikey indexes.
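To make that concrete, here is a minimal sketch for mongosh. The index on myObjectIds uses the collection from the question; the `posts` collection is a hypothetical name for wherever the comments array lives, and the `createArrayIndexes` wrapper is only an illustration so the snippet can be called with any `db` handle.

```javascript
// Sketch: create indexes on array fields. MongoDB treats an index as
// multikey automatically once the indexed field contains an array.
function createArrayIndexes(db) {
  // Array of ObjectIds from the question: one index entry per array element.
  db.myCollection.createIndex({ myObjectIds: 1 });

  // Field of the embedded documents in an array
  // ("posts" is a hypothetical collection name).
  db.posts.createIndex({ "comments.user": 1 });
}
```

In mongosh this would be run as `createArrayIndexes(db)`. Since every array element gets its own index entry, an array with hundreds of thousands of elements means hundreds of thousands of index entries for that one document, so the index size is worth measuring.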
The maximum size an individual document can be in MongoDB is 16MB with a nested depth of 100 levels.
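On the calculation in the question: dividing the limit by 12 overestimates the count, because BSON stores an array like a document whose keys are the indexes "0", "1", "2" and so on. Each element costs one type byte, the index as a decimal string, a NUL terminator, and then the 12 bytes of the ObjectId. A rough sketch of the arithmetic is below; the function names are made up for illustration, and `otherBytes` is an assumed allowance for the rest of the document (_id, field names and so on).

```javascript
// Rough estimate of how many ObjectIds fit in one BSON array.
// Assumption: each array element is stored as
//   1 type byte + the index as a decimal string + 1 NUL byte + 12 bytes of ObjectId.
const OBJECT_ID_BYTES = 12;

// Size in bytes of a BSON array holding n ObjectIds.
function objectIdArrayBytes(n) {
  let size = 4 + 1; // int32 length prefix + trailing NUL of the array itself
  let start = 0;    // first index in the current digit bucket
  for (let digits = 1; start < n; digits++) {
    // Indexes in [start, end) are written with `digits` decimal digits.
    const end = Math.min(n, Math.pow(10, digits));
    size += (end - start) * (1 + digits + 1 + OBJECT_ID_BYTES);
    start = end;
  }
  return size;
}

// Largest n whose array still fits in limitBytes, leaving otherBytes
// for the rest of the document (binary search, since size grows with n).
function maxObjectIds(limitBytes, otherBytes = 0) {
  const budget = limitBytes - otherBytes;
  let lo = 0;
  let hi = Math.floor(budget / OBJECT_ID_BYTES); // upper bound: raw 12 bytes each
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (objectIdArrayBytes(mid) <= budget) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}

console.log(maxObjectIds(16 * 1024 * 1024)); // 16 MB limit
console.log(maxObjectIds(4 * 1024 * 1024));  // older 4 MB limit
```

By this estimate the per-element overhead brings the ceiling down to roughly 840,000 ObjectIds for a 16 MB document, and to a little over 200,000 for the old 4 MB limit, so the 300,000 figure in the question looks optimistic for 4 MB.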
TBH, I think the best thing you can do is benchmark it. Create some dummy data and test the performance as you increase the number of items in the array. It may be quicker to knock up a test in your environment than to wait for an answer here.
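Something along these lines could serve as a starting point. It is only a sketch and assumes the synchronous mongosh API (insertOne, find(...).toArray()); the function name, the scratch collection and the `makeId` callback are all made up for illustration, and the sizes passed in should be positive.

```javascript
// Sketch of a benchmark loop for mongosh (synchronous shell API assumed).
// `collection` is a scratch collection, `makeId` returns a fresh ObjectId.
function benchmarkArraySizes(collection, makeId, sizes) {
  const results = [];
  for (const n of sizes) {
    // Dummy document with n ObjectIds in the array field.
    const ids = Array.from({ length: n }, () => makeId());
    collection.insertOne({ myObjectIds: ids });

    // Time a single-value lookup for the last element of the array.
    const probe = ids[ids.length - 1];
    const start = Date.now();
    const found = collection.find({ myObjectIds: probe }).toArray();
    results.push({ n, ms: Date.now() - start, matched: found.length });
  }
  return results;
}
```

In mongosh it might be called as `benchmarkArraySizes(db.arrayBench, () => new ObjectId(), [1000, 10000, 100000])`, where `arrayBench` is a hypothetical scratch collection. Running it once with and once without an index on myObjectIds should show how much the index helps. Date.now() only has millisecond resolution, so repeat the query a few times for anything close to that.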
It is one thing on my TODO list to investigate and blog about, but I haven't got round to it yet. If you do, I'd definitely be interested to see what your findings are! Likewise, if I get round to it soon I will post the results here too.
With the release of MongoDB 2.4 you can use capped arrays. When you $push new elements, you can tell MongoDB to $sort and $slice the array to keep it to a fixed length based on your criteria (if you don't care about throwing data away). For example, you could use this to keep only the most recent N entries in a data log.
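As a sketch of what such an update could look like, the helper below builds the update document for a capped push. The helper name and its parameters are hypothetical; the operators ($push with $each, $sort and $slice) are the ones introduced in 2.4. The entries are assumed to be embedded documents with a sortable field, since $sort in 2.4 sorts by a field of the array's documents.

```javascript
// Sketch: build an update that pushes one entry and keeps only the
// last `maxLength` elements after sorting ascending by `sortKey`.
function cappedPushUpdate(field, entry, sortKey, maxLength) {
  return {
    $push: {
      [field]: {
        $each: [entry],          // $sort and $slice are modifiers of $each
        $sort: { [sortKey]: 1 }, // oldest first
        $slice: -maxLength,      // negative slice keeps the last N elements
      },
    },
  };
}
```

For a data log this might be used as `db.myCollection.update({_id: someId}, cappedPushUpdate("log", {ts: new Date(), msg: "..."}, "ts", 100))`, where `log`, `ts`, `msg` and `someId` are placeholder names. With an ascending sort and a negative slice, the 100 most recent entries survive each push.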