MongoDB ranged pagination

It's said that using skip() for pagination in a MongoDB collection with many records is slow and not recommended.

Ranged pagination (based on an _id > last seen _id comparison) could be used instead:

db.items.find({_id: {$gt: ObjectId('4f4a3ba2751e88780b000000')}});

It's good for displaying prev. & next buttons, but it's not very easy to implement when you want to display actual page numbers such as 1 ... 5 6 7 ... 124, because you need to pre-calculate the "_id" from which each page starts.

So I have two questions:

1) When should I start to worry about that? At what point are there "too many records", with a noticeable slowdown for skip()? 1 000? 1 000 000?

2) What is the best approach to show links with actual page numbers when using ranged pagination?

727

asked Mar 14 '12 13:03

Roman

2 Answers

Good question!

"How many is too many?" - that, of course, depends on your data size and performance requirements. I, personally, feel uncomfortable when I skip more than 500-1000 records.

The actual answer depends on your requirements. Here's what modern sites do (or, at least, some of them).

First, the navbar looks like this:

1 2 3 ... 457

They get the final page number from the total record count and the page size. Let's jump to page 3. That will involve some skipping from the first record. When the results arrive, you know the id of the first record on page 3.

1 2 3 4 5 ... 457

Let's skip some more and go to page 5.

1 ... 3 4 5 6 7 ... 457

You get the idea. At each point you see first, last and current pages, and also two pages forward and backward from the current page.
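The navbar logic described above can be sketched roughly as follows. This is only an illustration: the function names lastPage and pageLinks and the radius parameter are made up here, and the '...' marker is just one way to represent a gap.

```javascript
// Last page number from the total record count and the page size.
function lastPage(totalCount, pageSize) {
  return Math.max(1, Math.ceil(totalCount / pageSize));
}

// Page links to show: the first page, the last page, and `radius` pages
// on each side of `current`. Gaps are returned as the string '...'.
function pageLinks(current, last, radius = 2) {
  const from = Math.max(1, current - radius);
  const to = Math.min(last, current + radius);
  const links = [];
  if (from > 1) {
    links.push(1);
    if (from > 2) links.push('...');
  }
  for (let p = from; p <= to; p++) links.push(p);
  if (to < last) {
    if (to < last - 1) links.push('...');
    links.push(last);
  }
  return links;
}
```

For example, pageLinks(5, 457).join(' ') gives "1 ... 3 4 5 6 7 ... 457", matching the third navbar above.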

Queries

var current_id; // id of first record on current page.

// go to page current+N
db.collection.find({_id: {$gte: current_id}}).
              skip(N * page_size).
              limit(page_size).
              sort({_id: 1});

// go to page current-N
// note that due to the nature of skipping back,
// this query will get you records in reverse order
// (last records on the page being first in the resultset)
// You should reverse them in the app.
db.collection.find({_id: {$lt: current_id}}).
              skip((N-1)*page_size).
              limit(page_size).
              sort({_id: -1});
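If it helps, the two queries can be folded into one helper that picks the direction from the page numbers. This is a sketch under assumptions: planJump and the shape of the object it returns are hypothetical names for illustration, and the shell usage in the trailing comment is not executed here.

```javascript
// Build query parameters for jumping from the current page to targetPage.
// currentId is the _id of the first record on the current page.
function planJump(currentId, currentPage, targetPage, pageSize) {
  const n = targetPage - currentPage;
  if (n >= 0) {
    // Forward (or same page): start at the current page's first record.
    return {
      filter: { _id: { $gte: currentId } },
      sort: { _id: 1 },
      skip: n * pageSize,
      limit: pageSize,
      reverse: false,
    };
  }
  // Backward: walk down from just before the current page.
  return {
    filter: { _id: { $lt: currentId } },
    sort: { _id: -1 },
    skip: (-n - 1) * pageSize,
    limit: pageSize,
    reverse: true, // results arrive last-to-first; reverse them in the app
  };
}

// Possible usage in the mongo shell (assumed, not run here):
//   var p = planJump(current_id, 5, 3, page_size);
//   var docs = db.collection.find(p.filter).sort(p.sort)
//                .skip(p.skip).limit(p.limit).toArray();
//   if (p.reverse) docs.reverse();
```

The skip is always measured from the current page, so it stays small as long as people jump only a few pages at a time.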

149

answered Oct 30 '22 10:10

Sergio Tulentsev

It's hard to give a general answer because it depends a lot on what query (or queries) you are using to construct the set of results that are being displayed. If the results can be found using only the index and are presented in index order, then db.dataset.find().limit().skip() can perform well even with a large number of skips. This is likely the easiest approach to code up. But even in that case, if you can cache page numbers and tie them to index values, you can make it faster for the second and third person who wants to view page 71, for example.

In a very dynamic dataset where documents will be added and removed while someone else is paging through data, such caching will become out-of-date quickly and the limit and skip method may be the only one reliable enough to give good results.
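One possible shape for such a cache, as a rough sketch only: PageStartCache and its methods are hypothetical names, the index value is assumed to be the _id of the first record on a page, and clearing everything on change is just the simplest answer to the staleness problem mentioned above.

```javascript
// Cache of page number -> _id of the first record on that page.
class PageStartCache {
  constructor() {
    this.starts = new Map();
  }

  // Record the first _id seen on a page once its results have been fetched.
  remember(page, firstId) {
    this.starts.set(page, firstId);
  }

  // Closest cached page to the target, or null if nothing is cached.
  // Starting from it keeps the remaining skip as small as possible.
  nearest(targetPage) {
    let best = null;
    for (const [page, firstId] of this.starts) {
      if (best === null ||
          Math.abs(page - targetPage) < Math.abs(best.page - targetPage)) {
        best = { page, firstId };
      }
    }
    return best;
  }

  // Drop everything, e.g. after inserts or deletes make the entries stale.
  clear() {
    this.starts.clear();
  }
}
```

With page 70 cached, a request for page 71 only needs to skip one page from a known _id instead of 70 pages from the start.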

answered Oct 30 '22 11:10

Tad Marshall

Tags:

performance

mongodb

pagination