
Concurrency issues when removing dependent documents with mongoose middlewares

Suppose we have a simple application where users can create products and comment on them. The schemas for products and comments could be:

var productSchema = new mongoose.Schema({
  author_id: ObjectId,
  description: String
});

var commentSchema = new mongoose.Schema({
  product_id: ObjectId,
  author_id: ObjectId,
  message: String
});

We want to make sure that every comment refers to an existing product. This can be done with a Mongoose pre save hook:

commentSchema.pre("save", function(next) {
  Product.count({ _id: this.product_id }, function(err, count) {
    if (err || !count) {
      next(new Error("Could not find product"));
    } else {
      next();
    }
  });
});

Also, if a user removes a product, we want to remove all the comments on that product. This can be done with a pre remove hook:

productSchema.pre("remove", function(next) {
  Comment.remove({ product_id: this._id }, next);
});

But what if user A removes a product and at the same time user B comments on that product?

The following could occur:

1) Call pre save hook for new comment, and check if product exists
2) Call pre remove hook for product, and remove all comments
3) In pre save hook, done checking: product actually exists, call next
4) Comment saved
5) In pre remove hook, done removing comments: call next
6) Product removed

The end result is that we have a comment that refers to a non-existent product.
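
The interleaving above can be replayed deterministically without a database. The following is only a sketch: the in-memory `db` object and the helpers `makeDb`, `saveCommentSteps`, `removeProductSteps`, `interleave` and `orphans` are hypothetical stand-ins, not Mongoose APIs. Each operation is split into steps at the points where the real hooks would wait for the database.

```javascript
// In-memory stand-ins for the two collections (hypothetical, not Mongoose).
function makeDb() {
  return { products: new Set(["p1"]), comments: [] };
}

// Saving a comment with a pre save hook: check first, then save.
function saveCommentSteps(db, comment) {
  let productExists = false;
  return [
    () => { productExists = db.products.has(comment.product_id); }, // pre save check
    () => { if (productExists) db.comments.push(comment); },        // actual save
  ];
}

// Removing a product with a pre remove hook: delete comments, then the product.
function removeProductSteps(db, productId) {
  return [
    () => { db.comments = db.comments.filter(c => c.product_id !== productId); }, // pre remove hook
    () => { db.products.delete(productId); },                                      // actual remove
  ];
}

// Run the steps in the order given by `schedule`: "S" = next save step, "R" = next remove step.
function interleave(schedule, saveSteps, removeSteps) {
  const queues = { S: [...saveSteps], R: [...removeSteps] };
  for (const who of schedule) queues[who].shift()();
}

// Comments whose product no longer exists.
function orphans(db) {
  return db.comments.filter(c => !db.products.has(c.product_id));
}

const db = makeDb();
interleave(
  "SRSR", // check product, remove comments, save comment, remove product
  saveCommentSteps(db, { product_id: "p1", message: "hi" }),
  removeProductSteps(db, "p1")
);
const orphanCount = orphans(db).length;
console.log(orphanCount); // 1: the comment survived, the product did not
```

Running the same steps with the schedule `"SSRR"` (the save finishes before the removal starts) leaves no orphan, which is why the bug only shows up under concurrency.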

This is just one of many interleavings that can leave an orphaned comment behind. How can this corner case be prevented?

user7307621 asked Mar 01 '17 01:03


1 Answer

It seems that using Mongoose post hooks instead of pre hooks solves the problem:

commentSchema.post("save", function(comment) {
  Product.count({ _id: comment.product_id }, function(err, count) {
    if (err || !count) comment.remove();
  });
});

productSchema.post("remove", function(product) {
  Comment.remove({ product_id: product._id }).exec();
});

Let's see why this solves the problem by considering the four possible cases (that I can think of):

1) Comment gets saved before product is removed
2) Comment gets saved after product is removed but before the post remove hook runs
3) Comment gets saved after product is removed and while the post remove hook is executing
4) Comment gets saved after product is removed and after the post remove hook has executed

In case 1, after the product is removed, the comment will be removed in the post remove hook.
In case 2, same, the post remove hook will remove the comment.
In case 3, the comment's post save hook will find no product and remove the comment.
In case 4, same as case 3, the post save hook removes the comment.
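
The case analysis can be checked mechanically by enumerating every interleaving of the two operations. This is a minimal sketch under the same assumption as the cases above: each operation is modelled as a fixed sequence of steps against an in-memory store. The names `interleavings` and `runSchedule` are hypothetical helpers, not part of Mongoose.

```javascript
// Every way to merge two label lists while keeping each list's internal order.
function interleavings(a, b) {
  if (a.length === 0) return [b];
  if (b.length === 0) return [a];
  return [
    ...interleavings(a.slice(1), b).map(rest => [a[0], ...rest]),
    ...interleavings(a, b.slice(1)).map(rest => [b[0], ...rest]),
  ];
}

// Replay one schedule on a fresh in-memory store and count orphaned comments.
function runSchedule(schedule) {
  const db = { products: new Set(["p1"]), comments: [] };
  const comment = { product_id: "p1", message: "hi" };
  let productExists = true;
  const steps = {
    S: [
      () => { db.comments.push(comment); },                            // save
      () => { productExists = db.products.has(comment.product_id); }, // post save: count
      () => {                                                          // post save: comment.remove()
        if (!productExists) db.comments = db.comments.filter(c => c !== comment);
      },
    ],
    R: [
      () => { db.products.delete("p1"); },                                      // remove
      () => { db.comments = db.comments.filter(c => c.product_id !== "p1"); }, // post remove
    ],
  };
  for (const who of schedule) steps[who].shift()();
  return db.comments.filter(c => !db.products.has(c.product_id)).length;
}

const schedules = interleavings(["S", "S", "S"], ["R", "R"]);
const orphanCounts = schedules.map(runSchedule);
console.log(schedules.length, Math.max(...orphanCounts)); // 10 0
```

None of the ten schedules leaves an orphan, but only because every step is assumed to run to completion. The model says nothing about a process that dies between steps.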

However, there is still a small problem: what if something goes wrong after the product is removed but before the post remove hook runs, say a power failure? In that case we end up with comments that refer to a product that doesn't exist. To fix this we can keep the pre remove hook on products. This guarantees that a product is removed only if its dependent comments were removed first. It does not handle the concurrency problem pointed out by the OP, though, and that is where the post remove hook comes to the rescue. So we need both:

productSchema.pre("remove", function(next) {
  var product = this;
  Comment.remove({ product_id: product._id }, next);
});

productSchema.post("remove", function(product) {
  Comment.remove({ product_id: product._id }).exec();
});
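
To see what the extra pre remove hook buys, a crash can be modelled as running only the first few steps of the removal. This is a rough sketch with a hypothetical `removeProduct` helper and an in-memory store. It assumes one comment already exists and that no new comment arrives during the removal.

```javascript
// Run a product removal against an in-memory store and "crash" after
// `stepsBeforeCrash` steps. Returns the number of orphaned comments left behind.
function removeProduct(hooks, stepsBeforeCrash) {
  const db = {
    products: new Set(["p1"]),
    comments: [{ product_id: "p1", message: "hi" }],
  };
  const removeComments = () => {
    db.comments = db.comments.filter(c => c.product_id !== "p1");
  };
  const steps = [];
  if (hooks.pre) steps.push(removeComments);         // pre remove hook
  steps.push(() => { db.products.delete("p1"); });   // the removal itself
  if (hooks.post) steps.push(removeComments);        // post remove hook
  steps.slice(0, stepsBeforeCrash).forEach(step => step());
  return db.comments.filter(c => !db.products.has(c.product_id)).length;
}

// Post hook only: crash right after the product is removed (1 step completed).
const postOnlyOrphans = removeProduct({ pre: false, post: true }, 1);

// Pre and post hooks: try every crash point, including no crash at all (3 steps).
const bothHooksOrphans = [0, 1, 2, 3].map(n => removeProduct({ pre: true, post: true }, n));

console.log(postOnlyOrphans, bothHooksOrphans); // 1 [ 0, 0, 0, 0 ]
```

With both hooks, the existing comment is gone before the product is, so no crash point in the removal leaves it orphaned.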

I wish this were it, but I can still think of a very remote case: what if a comment gets saved after the product is removed and the post remove hook has executed, but the power goes out just before the comment's post save hook (which would remove the comment) runs? We end up with a comment that refers to a product that does not exist. The odds of this happening are incredibly low, but still.

If anyone can think of a better way to handle concurrency, please improve my answer or write your own!

cute_ptr answered Sep 18 '22 03:09