I have a simple query that selects the top 200 rows, ordered by one column and filtered on another, indexed column. What confuses me is that the query plan in PL/SQL Developer shows this index being used only when I select all rows. Here is the top-200 query:
SELECT * FROM
(
SELECT *
FROM cr_proposalsearch ps
WHERE UPPER(ps.customerpostcode) like 'MK3%'
ORDER BY ps.ProposalNumber DESC
)
WHERE ROWNUM <= 200
The plan shows that it uses index CR_PROPOSALSEARCH_I1, which is an index on two columns, PROPOSALNUMBER and UPPER(CUSTOMERNAME). This takes 0.985s to execute.
If I remove the ROWNUM condition, the plan is what I expect, and the query executes in 0.343s.
That plan uses index XIF25CR_PROPOSALSEARCH, which is on CR_PROPOSALSEARCH (UPPER(CUSTOMERPOSTCODE)).
How come?
EDIT: I have gathered statistics on the cr_proposalsearch table, and both query plans now show that they use the XIF25CR_PROPOSALSEARCH index.
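For reference, statistics on a table and its indexes can be gathered with DBMS_STATS along these lines. This is only a sketch: it assumes the table is owned by the current user, and the cascade setting is an illustrative choice rather than something stated in the question.

```sql
-- Gather optimizer statistics for the table.
-- cascade => TRUE also gathers statistics for the table's indexes.
-- Assumption: cr_proposalsearch is owned by the current user.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'CR_PROPOSALSEARCH',
    cascade => TRUE
  );
END;
/
```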
Including the ROWNUM changes the optimizer's calculations about which is the more efficient path.
When you do a top-n query like this, it doesn't necessarily mean that Oracle will get all the rows, fully sort them, then return the top ones. The COUNT STOPKEY operation in the execution plan indicates that Oracle will only perform the underlying operations until it has found the number of rows you asked for.
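One way to see the COUNT STOPKEY step for yourself is to explain the top-n query and print the plan with DBMS_XPLAN. This is a minimal sketch; it assumes a PLAN_TABLE is available to your session, which is normally the case in current Oracle versions.

```sql
EXPLAIN PLAN FOR
SELECT *
FROM (SELECT *
      FROM cr_proposalsearch ps
      WHERE UPPER(ps.customerpostcode) LIKE 'MK3%'
      ORDER BY ps.proposalnumber DESC)
WHERE ROWNUM <= 200;

-- Print the plan that was just explained.
-- Look for a COUNT STOPKEY line above the index access.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```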
The optimizer has calculated that the full query will acquire and sort 77K rows. If it used this plan for the top-n query, it would have to do a large sort of those rows to find the top 200 (it wouldn't necessarily have to fully sort them, as it wouldn't care about the exact order of rows past the top; but it would have to look over all of those rows).
The plan for the top-n query uses the other index to avoid having to sort at all. It considers each row in order, checks whether it matches the predicate, and if so returns it. When it's returned 200 rows, it's done. Its calculations have indicated that this will be more efficient for getting a small number of rows. (It may not be right, of course; you haven't said what the relative performance of these queries is.)
If the optimizer were to choose this plan when you ask for all rows, it would have to read through the entire index in descending order, getting each row from the table by ROWID as it goes to check against the predicate. This would result in a lot of extra I/O and inspecting many rows that would not be returned. So in this case, it decides that using the index on customerpostcode is more efficient.
If you gradually increase the number of rows to be returned from the top-n query, you will probably find a tipping point where the plan switches from the first to the second. Just from the costs of the two plans, I'd guess this might be around 1,200 rows.
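To look for that tipping point, you could explain the same query with different row limits and compare which index each plan uses. The sketch below tags each plan with a statement ID; the IDs and the limits of 1000 and 1500 are arbitrary values chosen for illustration, not numbers taken from your plans.

```sql
-- Clear any earlier plans stored under the same (made-up) IDs.
DELETE FROM plan_table
WHERE statement_id IN ('topn_1000', 'topn_1500');

EXPLAIN PLAN SET STATEMENT_ID = 'topn_1000' FOR
SELECT *
FROM (SELECT *
      FROM cr_proposalsearch ps
      WHERE UPPER(ps.customerpostcode) LIKE 'MK3%'
      ORDER BY ps.proposalnumber DESC)
WHERE ROWNUM <= 1000;

EXPLAIN PLAN SET STATEMENT_ID = 'topn_1500' FOR
SELECT *
FROM (SELECT *
      FROM cr_proposalsearch ps
      WHERE UPPER(ps.customerpostcode) LIKE 'MK3%'
      ORDER BY ps.proposalnumber DESC)
WHERE ROWNUM <= 1500;

-- List the index operations the optimizer chose for each limit.
SELECT statement_id, operation, options, object_name, cost
FROM plan_table
WHERE statement_id IN ('topn_1000', 'topn_1500')
  AND operation = 'INDEX'
ORDER BY statement_id, id;
```

If the two statement IDs show different index names, the switch happens somewhere between those limits, and you can narrow the range from there.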
If you are sure your stats are up to date and that the index is selective enough, you could tell Oracle to use the index with a hint:
SELECT *
FROM (SELECT /*+ index(ps XIF25CR_PROPOSALSEARCH) */ *
FROM cr_proposalsearch ps
WHERE UPPER (ps.customerpostcode) LIKE 'MK3%'
ORDER BY ps.proposalnumber DESC)
WHERE ROWNUM <= 200
(I would only recommend this approach as a last resort.)
If I were doing this, I would first run the query through tkprof to see how much work it is actually doing. For example, the estimated cost of the index range scans could be way off.
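A minimal sketch of producing a trace file for tkprof, assuming you have the privileges to trace your own session. The identifier topn_test is just a made-up label that makes the trace file easier to find.

```sql
-- Tag the trace file name so it is easy to locate on the server.
ALTER SESSION SET tracefile_identifier = 'topn_test';

-- Turn on SQL trace for this session, including wait events.
BEGIN
  DBMS_SESSION.SESSION_TRACE_ENABLE(waits => TRUE, binds => FALSE);
END;
/

-- Run the query under investigation here and fetch all of its rows.

-- Turn tracing off again.
BEGIN
  DBMS_SESSION.SESSION_TRACE_DISABLE;
END;
/
```

The resulting trace file is then formatted with the tkprof command-line utility on the database server.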
I forgot to mention: you should also check the actual cardinality:
SELECT count(*) FROM cr_proposalsearch ps WHERE UPPER(ps.customerpostcode) like 'MK3%'
and then compare it to the cardinality in the query plan.
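Another way to make that comparison is to run the statement with the GATHER_PLAN_STATISTICS hint and then print estimated and actual row counts side by side. This is a sketch that assumes your user can read the V$ views that DBMS_XPLAN.DISPLAY_CURSOR relies on.

```sql
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM cr_proposalsearch ps
WHERE UPPER(ps.customerpostcode) LIKE 'MK3%';

-- Run this in the same session, immediately after the statement above.
-- E-Rows is the optimizer's estimate; A-Rows is the actual row count.
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Some client tools issue their own statements in between, in which case the last cursor is not yours and you would need to pass the SQL_ID explicitly instead of NULL.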