The document I am working on is extremely large. It collects user input from a very long survey (like SurveyMonkey) and stores the answers in a MongoDB database.
Unsurprisingly, I am getting the following error:
Error: Document exceeds maximal allowed bson size of 16777216 bytes
If I cannot change the fields in my document, is there anything I can do? Is there some way to compress the document, for example by removing whitespace?
Edit
Here is the structure of the document
Schema({
    id : { type: Number, required: true },
    created: { type: Date, default: Date.now },
    last_modified: { type: Date, default: Date.now },
    data : { type: Schema.Types.Mixed, required: true }
});
An example of the data field:
{
    id: 65,
    question: {
        test: "some questions",
        answers: [2,5,6]
    }
    // there could be thousands of these question objects
}
The maximum size of an individual document in MongoDB is 16 MB, with a maximum nesting depth of 100 levels. Edit: There is no max size for an individual MongoDB database.
The maximum BSON document size in MongoDB is 16 MB. Users should avoid certain application patterns that would allow documents to grow unbounded.
Fixed-size collections are called capped collections in MongoDB. When creating a capped collection, the user must specify its maximum size in bytes and can optionally specify the maximum number of documents it may store. Once that capacity is reached, new documents overwrite the oldest ones.
As you know, MongoDB stores data in documents. The limit for one document is 16 MB. You can also use GridFS to store large files that exceed 16 MB; it splits each file into multiple chunks.
One thing you can do is build your own MongoDB :-). MongoDB is open source, and the document size limit is a rather arbitrary value chosen to encourage better schema design. You can just modify this line and build it yourself. Be careful with this.
The most straightforward idea is to store each small question in its own document, with a field that references its parent.
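As a rough sketch of that idea, the helper below turns one large survey into many small documents in plain JavaScript. The names `toQuestionDocuments`, `survey_id` and `position` are made up for illustration and are not part of the original schema.

```javascript
// Hypothetical helper: one small document per question, each pointing at its parent.
// `survey_id` and `position` are assumed field names, not from the original schema.
function toQuestionDocuments(surveyId, questions) {
  return questions.map((q, index) => ({
    survey_id: surveyId, // reference back to the parent survey
    position: index,     // keeps the original question order
    id: q.id,
    question: q.question,
  }));
}

const sampleQuestions = [
  { id: 65, question: { test: "some questions", answers: [2, 5, 6] } },
  { id: 66, question: { test: "more questions", answers: [1] } },
];

const questionDocs = toQuestionDocuments(1, sampleQuestions);
console.log(questionDocs.length); // 2
```

The resulting array could then be written with something like insertMany and read back by querying on `survey_id` and sorting on `position`. Each document stays tiny, so the 16 MB limit only matters if a single question is itself huge.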
Another idea is to limit the number of questions stored in each parent document, and to use several parent documents per survey. Let's say your limit is N elements; then each parent looks like this:
{
    _id : ObjectId(),
    id : { type: Number, required: true },
    created: { type: Date, default: Date.now }, // you can store it only for the first element
    last_modified: { type: Date, default: Date.now }, // the same here
    data : [
        {
            id: 65,
            question: {
                test: "some questions",
                answers: [2,5,6]
            }
        },
        // ... up to N of such things {}
    ]
}
This way, by adjusting N you can make sure each document stays within the 16 MB BSON limit. To read the whole survey, you can select
db.coll.find({id: the Id you need})
and then combine the whole survey at the application level. Also, do not forget to create an index on id (ensureIndex, or createIndex in newer MongoDB versions).
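A minimal sketch of the bucketing and recombining steps, in plain JavaScript with no database calls. The function names and the `part` field are assumptions added here so the buckets can be put back in order; they are not in the original schema.

```javascript
// Split a survey's questions into documents holding at most n questions each.
// `part` is a hypothetical field recording the bucket order.
function toBuckets(surveyId, questions, n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  const buckets = [];
  for (let i = 0; i < questions.length; i += n) {
    buckets.push({
      id: surveyId,            // same id on every bucket of the survey
      part: buckets.length,    // 0, 1, 2, ...
      data: questions.slice(i, i + n),
    });
  }
  return buckets;
}

// Rebuild the full question list from the buckets returned by a find on id.
function fromBuckets(buckets) {
  return [...buckets]
    .sort((a, b) => a.part - b.part) // the query result order is not assumed
    .flatMap((b) => b.data);
}

const surveyQuestions = Array.from({ length: 5 }, (_, i) => ({
  id: i,
  question: { test: "question " + i, answers: [i] },
}));

const buckets = toBuckets(65, surveyQuestions, 2);
console.log(buckets.map((b) => b.data.length)); // [ 2, 2, 1 ]
console.log(fromBuckets(buckets).length);       // 5
```

A suitable N depends on how large the individual questions are, so it is worth measuring real documents before settling on a value.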
Try different things, do a benchmark on your data and see what works for you.