
How does pagination of results in databases work?

This is a general question that applies to MySQL, Oracle DB or whatever else might be out there.

I know MySQL has LIMIT offset, size, and Oracle has ROW_NUMBER or something like that.

But when such 'paginated' queries are called back to back, does the database engine actually run the entire SELECT all over again and retrieve a different subset of results each time? Or does it fetch the full result only once, keep it in memory or something, and then serve subsets from it for subsequent queries based on offset and size?

If it does the full fetch every time, then it seems quite inefficient.

If it does the full fetch only once, it must be 'storing' the query somewhere somehow, so that the next time that query comes in, it knows that it has already fetched all the data and just needs to extract the next page from it. In that case, how will the database engine handle multiple threads? What about two threads executing the same query?

I am very confused :(

shikhanshu asked Jun 13 '18 21:06



1 Answer

Yes, the query is executed over again when you run it with a different OFFSET.

Yes, this is inefficient. Don't do that if you need to paginate through a large result set.
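To make the point above concrete, here is a minimal sketch of LIMIT/OFFSET pagination. It assumes an in-memory SQLite database as a stand-in for MySQL, and the `items` table, the `fetch_page` helper, and the page size are all made up for illustration. The trace callback just records each statement the engine runs, so you can see that every page request is a separate SELECT.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; table and names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1, 101)],
)
conn.commit()

# Record every SQL statement the engine executes from here on.
executed = []
conn.set_trace_callback(executed.append)


def fetch_page(conn, page, page_size=10):
    """Run a fresh LIMIT/OFFSET query for one 1-based page."""
    offset = (page - 1) * page_size
    return conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


page_1 = fetch_page(conn, 1)
page_2 = fetch_page(conn, 2)

# One SELECT per page request: nothing is remembered between calls.
select_runs = [sql for sql in executed if sql.lstrip().upper().startswith("SELECT")]
print(len(select_runs), page_1[0], page_2[0])
```

Nothing carries over from the first call to the second, so for a large OFFSET the engine still has to produce and skip all the earlier rows before it reaches the requested page.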

I'd suggest running the query once with a large LIMIT, enough for 10 or 12 pages, and saving the result in a cache. When the user pages forward, your application can read those 10-12 pages from the cache and display the page the user wants to see. That is usually much faster than running the SQL query for each page.

This works well if, like most users, your user reads only a few pages and then changes their query.
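A rough sketch of that idea, under some loud assumptions: a plain dict stands in for the cache, SQLite stands in for the real database, and the `items` table, `get_page`, the key format, and the constants are all hypothetical. It fetches a block of 10 pages in one query and then serves individual pages by slicing the cached block.

```python
import sqlite3

PAGE_SIZE = 10
PAGES_PER_FETCH = 10  # fetch enough rows for 10 pages per database query

# Assumption: a dict stands in for an external cache; SQLite stands in for the database.
cache = {}
stats = {"db_queries": 0}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1, 251)],
)
conn.commit()


def get_page(conn, page):
    """Return one 1-based page, querying the database once per block of pages."""
    block = (page - 1) // PAGES_PER_FETCH
    key = f"items:block:{block}"  # hypothetical cache key format
    rows = cache.get(key)
    if rows is None:
        # Cache miss: one query with a large LIMIT covering the whole block.
        block_rows = PAGE_SIZE * PAGES_PER_FETCH
        rows = conn.execute(
            "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
            (block_rows, block * block_rows),
        ).fetchall()
        stats["db_queries"] += 1
        cache[key] = rows
    # Slice the requested page out of the cached block.
    start = ((page - 1) % PAGES_PER_FETCH) * PAGE_SIZE
    return rows[start:start + PAGE_SIZE]


first_page = get_page(conn, 1)
third_page = get_page(conn, 3)  # served from the cache, no new query
print(stats["db_queries"], third_page[0])
```

In a real application the cache key would also need to include the user's search parameters, and the entries would need an expiry so stale results eventually get refreshed.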


Re your comment:

By cache I mean something like Memcached or Redis. A high-speed, in-memory key/value store.

MySQL views don't store anything, they're more like a macro that runs a predefined query for you.
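As a small illustration of a view that stores nothing, here is a sketch using SQLite as a stand-in (its ordinary views are also not materialized). The `orders` table and `big_orders` view are made-up names. The view's result changes as soon as the underlying table changes, because the view's query is run again each time it is read.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and view names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER)")
conn.execute("INSERT INTO orders (total) VALUES (50), (150)")
conn.execute(
    "CREATE VIEW big_orders AS SELECT id, total FROM orders WHERE total > 100"
)

before = conn.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]

# Insert into the base table only; the view itself is never touched.
conn.execute("INSERT INTO orders (total) VALUES (200)")

after = conn.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]
print(before, after)
```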

Oracle supports materialized views, so that might work better, but querying the view would have the overhead of interpreting an SQL query.

A simpler in-memory cache should be much faster.

Bill Karwin answered Sep 28 '22 18:09
