Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB ranged pagination

It's said that using skip() for pagination in MongoDB collection with many records is slow and not recommended.

Ranged pagination (based on >_id comparsion) could be used

db.items.find({_id: {$gt: ObjectId('4f4a3ba2751e88780b000000')}}); 

It's good for displaying prev. & next buttons - but it's not very easy to implement when you want to display actual page numbers 1 ... 5 6 7 ... 124 - you need to pre-calculate from which "_id" each page starts.

So I have two questions:

1) When should I start worry about that? When there're "too many records" with noticeable slowdown for skip()? 1 000? 1 000 000?

2) What is the best approach to show links with actual page numbers when using ranged pagination?

like image 727
Roman Avatar asked Mar 14 '12 13:03

Roman


People also ask

What is the use of the limit () and Skip () method?

The limit() function in MongoDB is used to specify the maximum number of results to be returned. Only one parameter is required for this function.to return the number of the desired result. Sometimes it is required to return a certain number of results after a certain number of documents. The skip() can do this job.

What is Skip method in MongoDB?

MongoDB skip() is used when we required a certain number of results after a certain number of documents simultaneously we have use skip method in MongoDB. If we want to skip a certain number of documents from the collection, skip method will skip the specified documents that we have used with the MongoDB skip method.

What is limit in pagination?

The limit option allows you to limit the number of rows returned from a query, while offset allows you to omit a specified number of rows before the beginning of the result set. Using both limit and offset skips both rows as well as limit the rows returned.


2 Answers

Good question!

"How many is too many?" - that, of course, depends on your data size and performance requirements. I, personally, feel uncomfortable when I skip more than 500-1000 records.

The actual answer depends on your requirements. Here's what modern sites do (or, at least, some of them).

First, navbar looks like this:

1 2 3 ... 457 

They get final page number from total record count and page size. Let's jump to page 3. That will involve some skipping from the first record. When results arrive, you know id of first record on page 3.

1 2 3 4 5 ... 457 

Let's skip some more and go to page 5.

1 ... 3 4 5 6 7 ... 457 

You get the idea. At each point you see first, last and current pages, and also two pages forward and backward from the current page.

Queries

var current_id; // id of first record on current page.  // go to page current+N db.collection.find({_id: {$gte: current_id}}).               skip(N * page_size).               limit(page_size).               sort({_id: 1});  // go to page current-N // note that due to the nature of skipping back, // this query will get you records in reverse order  // (last records on the page being first in the resultset) // You should reverse them in the app. db.collection.find({_id: {$lt: current_id}}).               skip((N-1)*page_size).               limit(page_size).               sort({_id: -1}); 
like image 149
Sergio Tulentsev Avatar answered Oct 30 '22 10:10

Sergio Tulentsev


It's hard to give a general answer because it depends a lot on what query (or queries) you are using to construct the set of results that are being displayed. If the results can be found using only the index and are presented in index order then db.dataset.find().limit().skip() can perform well even with a large number of skips. This is likely the easiest approach to code up. But even in that case, if you can cache page numbers and tie them to index values you can make it faster for the second and third person that wants to view page 71, for example.

In a very dynamic dataset where documents will be added and removed while someone else is paging through data, such caching will become out-of-date quickly and the limit and skip method may be the only one reliable enough to give good results.

like image 26
Tad Marshall Avatar answered Oct 30 '22 11:10

Tad Marshall