
How does MongoDB handle a large array field?

I'm trying to store a list of ObjectIds in a document as an array field.

I understand MongoDB has a 4MB size limit for a single document. Since an ObjectId is 12 bytes long, a document should be able to hold more than 300,000 entries in one array field. (Let me know if the calculation is off.)
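As a rough check on that arithmetic: the BSON specification stores an array as a document whose keys are the decimal indexes "0", "1", and so on, so each element costs more than the 12 bytes of the ObjectId itself. The sketch below estimates the array size under that layout. The function names are made up for illustration, and the estimate ignores the rest of the document (the _id, the field name, any other fields).

```javascript
// Size in bytes of a BSON array holding n ObjectIds, per the BSON spec:
//   int32 length (4) + elements + trailing 0x00 (1)
// Each element: type byte (1) + index as a C string (digits + 1) + ObjectId (12).
function bsonObjectIdArraySize(n) {
  let size = 4 + 1;
  for (let i = 0; i < n; i++) {
    size += 1 + String(i).length + 1 + 12;
  }
  return size;
}

// Largest n whose array still fits within limitBytes (binary search).
function maxObjectIdEntries(limitBytes) {
  let lo = 0;
  let hi = Math.floor(limitBytes / 12); // upper bound: no per-element overhead
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (bsonObjectIdArraySize(mid) <= limitBytes) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}

const fourMB = 4 * 1024 * 1024;
console.log(maxObjectIdEntries(fourMB));
```

Under these assumptions a 4MB document holds a little over 200,000 ObjectIds rather than 300,000. Later MongoDB releases raised the document limit to 16MB, which scales the estimate accordingly.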

If the number of entries in the array gets close to that limit, what kind of performance can I expect, especially when the field is indexed? Are there any memory issues?


Typical queries would look like this:

Query by a single value

db.myCollection.find(
  {
    myObjectIds: ObjectId('47cc67093475061e3d95369d')
  }
);

Query by multiple values

db.myCollection.find(
  {
    myObjectIds: {$in: [ObjectId('47cc67093475061e3d95369d'), ...]}
  }
);

Add a new value to multiple documents

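db.myCollection.update(
  {
    _id: {$in: [ObjectId('56cc67093475061e3d95369d'), ...]}
  },
  {
    $addToSet: {myObjectIds: ObjectId('69cc67093475061e3d95369d')}
  },
  // multi: true is needed here; by default update() modifies only the first matching document
  {
    multi: true
  }
);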


Jaepil asked Mar 15 '11 07:03





2 Answers

TBH, I think the best thing you can do is to benchmark it. Create some dummy data, and test the performance as you increase the number of items in the array. It may be quicker to knock up a test in your environment than to wait for an answer here.
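One way to set up such a test, sketched in plain JavaScript: generate documents whose array grows each round and time an operation against each size. The helper names (fakeObjectIdHex, makeDummyDoc, timeMs, benchmark) are hypothetical, and the database call is left as a callback because it depends on your driver or shell.

```javascript
// Build a 24-character hex string shaped like an ObjectId (not a real one).
function fakeObjectIdHex(i) {
  return i.toString(16).padStart(24, "0");
}

// Dummy document with `count` entries in the array field from the question.
function makeDummyDoc(count) {
  const myObjectIds = [];
  for (let i = 0; i < count; i++) {
    myObjectIds.push(fakeObjectIdHex(i));
  }
  return { myObjectIds };
}

// Run fn once and return the elapsed wall-clock time in milliseconds.
function timeMs(fn) {
  const start = Date.now();
  fn();
  return Date.now() - start;
}

// Time `operation` against documents with arrays of increasing size.
function benchmark(sizes, operation) {
  return sizes.map((size) => {
    const doc = makeDummyDoc(size);
    return { size, ms: timeMs(() => operation(doc)) };
  });
}

// Placeholder operation: swap in an insert, a query, or an $addToSet against
// your own collection (wrapping each hex string in ObjectId(...) first).
const results = benchmark([1000, 10000, 100000], (doc) =>
  doc.myObjectIds.includes("not-present")
);
console.log(results);
```

A single timing per size is noisy, so repeating each measurement a few times and comparing medians would give a more trustworthy picture.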

It is one thing on my TODO list to investigate and blog about, but I haven't got round to it yet. If you do, I'd definitely be interested to see what your findings are! Likewise, if I get round to it soon I will post the results here too.

AdaTheDev answered Oct 10 '22 15:10



With the release of MongoDB 2.4 you can use capped arrays. When you $push to the array, you can tell MongoDB to $sort and $slice it to keep it to a fixed length based on your criteria (if you don't care about throwing data away). For example, you could use this to save the most recent N entries in a data log.
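A sketch of what such an update document might look like, assuming log entries shaped like { ts, msg } and a hypothetical helper named cappedPushUpdate. The $each, $sort, and $slice modifiers are MongoDB's own; applyCappedPush is only a local model of their effect (append, then sort, then keep the last N) and is not driver code.

```javascript
// Build an update document that appends entries, sorts by ts ascending,
// and keeps only the last maxLen elements. maxLen must be a positive integer.
function cappedPushUpdate(field, entries, maxLen) {
  return {
    $push: {
      [field]: { $each: entries, $sort: { ts: 1 }, $slice: -maxLen },
    },
  };
}

// Local model of the same steps on a plain array, for illustration only.
function applyCappedPush(existing, entries, maxLen) {
  const merged = existing.concat(entries);
  merged.sort((a, b) => a.ts - b.ts);
  return merged.slice(-maxLen);
}

const update = cappedPushUpdate("log", [{ ts: 4, msg: "d" }], 3);
console.log(JSON.stringify(update));

const log = [
  { ts: 1, msg: "a" },
  { ts: 2, msg: "b" },
  { ts: 3, msg: "c" },
];
console.log(applyCappedPush(log, [{ ts: 4, msg: "d" }], 3).map((e) => e.msg));
```

The update document would be passed as the second argument to update() on the collection, with the first argument selecting the document that holds the log.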

Josh Liptzin answered Oct 10 '22 13:10
