Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Speed of paged queries in Oracle

This is a never-ending topic for me and I'm wondering if I might be overlooking something. Essentially I use two types of SQL statements in an application:

  1. Regular queries with a "fallback" limit
  2. Sorted and paged queries

Now, we're talking about some queries against tables with several million records, joined to 5 more tables with several million records. Clearly, we hardly want to fetch all of them, that's why we have the above two methods to limit user queries.

Case 1 is really simple. We just add an additional ROWNUM filter:

WHERE ...
  AND ROWNUM < ?

That's quite fast, as Oracle's CBO will take this filter into consideration for its execution plan and probably apply a FIRST_ROWS operation (similar to the one enforced by the /*+FIRST_ROWS*/ hint.

Case 2, however is a bit more tricky with Oracle, as there is no LIMIT ... OFFSET clause as in other RDBMS. So we nest our "business" query in a technical wrapper as such:

SELECT outer.* FROM (
  SELECT * FROM (
    SELECT inner.*, ROWNUM as RNUM, MAX(ROWNUM) OVER(PARTITION BY 1) as TOTAL_ROWS
    FROM (
      [... USER SORTED business query ...]
    ) inner
  ) 
  WHERE ROWNUM < ?
) outer
WHERE outer.RNUM > ?

Note that the TOTAL_ROWS field is calculated to know how many pages we will have even without fetching all data. Now this paging query is usually quite satisfying. But every now and then (as I said, when querying 5M+ records, possibly including non-indexed searches), this runs for 2-3minutes.

EDIT: Please note, that a potential bottleneck is not so easy to circumvent, because of sorting that has to be applied before paging!

I'm wondering, is that state-of-the-art simulation of LIMIT ... OFFSET, including TOTAL_ROWS in Oracle, or is there a better solution that will be faster by design, e.g. by using the ROW_NUMBER() window function instead of the ROWNUM pseudo-column?

like image 885
Lukas Eder Avatar asked May 17 '11 15:05

Lukas Eder


People also ask

How do I make Oracle queries run faster?

One of the primary ways in which you can optimize your SELECT query is to ensure you only include the columns you really need. Avoid using SELECT * where possible, as this will pull out a large amount of data you probably don't need to deal with. You should also avoid using SELECT DISTINCT where possible.

Does pagination improve performance?

Thanks to pagination, we can split our large dataset into chunks ( or pages ) that we can gradually fetch and display to the user, thus reducing the load on the database. Pagination also solves a lot of performance issues both on the client and server-side!

Why with clause is faster in Oracle?

If the WITH clause is used to do something that would otherwise need PL/SQL then it will be better than PL/SQL. The optimizer decides execution plan. And at times have no performance penalty, but at other times it may do something more complex.


2 Answers

The main problem with Case 2 is that in many cases the whole query result set has to be obtained and then sorted before the first N rows can be returned - unless the ORDER BY columns are indexed and Oracle can use the index to avoid a sort. For a complex query and a large set of data this can take some time. However there may be some things you can do to improve the speed:

  1. Try to ensure that no functions are called in the inner SQL - these may get called 5 million times just to return the first 20 rows. If you can move these function calls to the outer query they will be called less.
  2. Use a FIRST_ROWS_n hint to nudge Oracle into optimising for the fact that you will never return all the data.

EDIT:

Another thought: you are currently presenting the user with a report that could return thousands or millions of rows, but the user is never realistically going to page through them all. Can you not force them to select a smaller amount of data e.g. by limiting the date range selected to 3 months (or whatever)?

like image 172
Tony Andrews Avatar answered Oct 11 '22 15:10

Tony Andrews


You might want to trace the query that takes a lot of time and look at its explain plan. Most likely the performance bottleneck comes from the TOTAL_ROWS calculation. Oracle has to read all the data, even if you only fetch one row, this is a common problem that all RDBMS face with this type of query. No implementation of TOTAL_ROWS will get around that.

The radical way to speed up this type of query is to forego the TOTAL_ROWS calculation. Just display that there are additional pages. Do your users really need to know that they can page through 52486 pages? An estimation may be sufficient. That's another solution, implemented by google search for example: estimate the number of pages instead of actually counting them.

Designing an accurate and efficient estimation algorithm might not be trivial.

like image 29
Vincent Malgrat Avatar answered Oct 11 '22 15:10

Vincent Malgrat