Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is Mongoose not scalable with document array editing and version control?

I am developing a web application with Node.js and MongoDB/Mongoose. Our most used Model, Record, has many subdocument arrays. Some of these, for instance, include "Comment", "Bookings", and "Subscribers".

In the client side application, whenever the user hits the "delete" button it fires off an AJAX request to the delete route for that specific comment. The problem I am running into is that, when many of these AJAX calls come in at once, Mongoose fails with a "Document not found" error on some (but not all) of the calls.

This only happens when the calls are made rapidly and many at a time. I think this is due to the version in Mongoose causing document conflicts. Our current process for a delete is:

  1. Fetch the document using Record.findById()
  2. Remove the subdocument from the appropriate array (using, say, comment.remove())
  3. Call record.save()

I have found a solution where I can manually update the collection using Record.findByIdAndUpdate and then using the $pull operator. However, this means we can't use any of mongoose's middleware and loose the version control entirely. And the more I think about it, the more I realize situations where this would happen and I would have to use Mongoose's wrapper functions like findByIdAndUpdate or findAndRemove. The only other solution I can think of would be to put the removal attempt into a while loop and hope it works, which seems like a very poor fix.

Using the Mongoose wrappers doesn't really solve my problem as it won't allow me to use any sort of Middleware or hooks at all then, which is basically one of the huge benefits of using Mongoose.

Does this mean that Mongoose is essentially useless for anything of with rapid editing and I might as well just use native MongoDB drivers? Am I misunderstanding Mongoose's limitations? How could I solve this problem?

like image 312
Chris Foster Avatar asked Dec 27 '22 07:12

Chris Foster


2 Answers

Mongoose's versioned document array editing is not scalable for the simple reason that it's not an atomic operation. As a result, the more array edit activity you have, the more likely it is that two edits will collide and you'll suffer the overhead of retry/recovery from that in your code.

For scalable document array manipulation, you have to use update with the atomic array update operators: $pull[All], $push[All], $pop, $addToSet, and $. Of course, you can also use these operators with the atomic findAndModify-based methods of findByIdAndUpdate and findOneAndUpdate if you also need the original or resulting doc.

As you mentioned, the big downside of using update instead of findOne+save is that none of your Mongoose middleware and validation is executed during an update. But I don't see that you have any choice if you want a scalable system. I'd much rather manually duplicate some middleware and validation logic for the update case than have to suffer the scalability penalties of using Mongoose's versioned document array editing. Hey, at least you still get the benefits of Mongoose's schema-based type casting on updates!

like image 70
JohnnyHK Avatar answered Dec 31 '22 09:12

JohnnyHK


I think, from our own experiences, the answer to your question is "yes". Mongoose is not scalable for rapid array-based updates.

Background

We're experiencing the same issue at HabitRPG. After a recent surge in user growth (bringing our DB to 6gb), we started experiencing VersionError for many array-based updates (background on VersionError). ensureIndex({_id:1,__v1:1}) helped a bit, but that tapered as yet more users joined. It would appear to me Mongoose is indeed not scalable for array-based updates. You can see our whole investigation process here.

Solution

If you can afford moving from an array to an object, do that. Eg, comments: Schema.Types.Array => comments: Schema.Types.Mixed, and sort by post.comments.{ID}.date, or even a manual post.comments.{ID}.position if necessary.

If you're stuck with arrays:

  1. db.collection.ensureIndex({_id:1,__v:1})
  2. Use your methods described above. You won't benefit from hooks and validations, but there are worse things.
like image 21
lefnire Avatar answered Dec 31 '22 09:12

lefnire