Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to avoid memory limit when skipping a large amount of records with Mongoose?

On a collection with over 100k records, when I query with Mongoose options like so:

contact.find({}, {}, {
  collation: {
    locale: 'en_US',
    strength: 1
  },
  skip: 90000,
  limit: 10,
  sort: {
    email: 1
  }
});

I get this error:

MongoError: Executor error during find command: OperationFailed: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.

But I do have an index on the email field:

{
  "v" : 2,
  "key" : {
    "email" : 1
  },
  "name" : "email_1",
    "ns" : "leadfox.contact",
    "background" : true
}

On the other hand when I query in the Mongo shell it works fine:

db.contact.find().sort({email: 1}).skip(90000).limit(10)
like image 538
Maxime Asselin Avatar asked Mar 09 '23 00:03

Maxime Asselin


1 Answers

What you are experiencing is because of skip. As you can see in documentation

The cursor.skip() method is often expensive because it requires the server to walk from the beginning of the collection or index to get the offset or skip position before beginning to return results. As the offset (e.g. pageNumber above) increases, cursor.skip() will become slower and more CPU intensive. With larger collections, cursor.skip() may become IO bound.

You should find a better approach instead of skip. As you are sorting documents with email field, you can write a range query with email field instead of skiplike that:

contact.find({ "email": { $gt: the_last_email_from_previous_query } }, {}, {
  collation: {
    locale: 'en_US',
    strength: 1
  },
  limit: 10,
  sort: {
    email: 1
  }
});

Update:

First of all. Like I said above, what you want is not possible. Mongodb says it, not me.

Secondly, I suggest you to search about modern pagination methods and people use cases. Your example in comment is absurd. No user should/would go directly to 790th page for any data. If they go directly a page like that, it most probably means , they covered data till 790th page and they want to continue. So even you are building a stateless system(like all modern systems these days) you should store some information about users last-point-view for your paginated data. This is an example approach(I am not saying best, it is just an example) based on user behavior.

Another approach, you can use (like most of modern pagination tables) you only allow user to navigate 5-6 pages forward or backward. So you can skip only 50-60 document in your query combined with $gtand $lton emailfield.

Another approach can be caching data in memory with some other tools.

I think you get the picture. Happy coding.

like image 141
barbakini Avatar answered Mar 10 '23 15:03

barbakini