
Is there a logically equivalent and efficient version of this query without using a CTE?

I have a query on a PostgreSQL 9.2 system that takes about 20s in its normal form but only ~120ms when using a CTE.

I simplified both queries for brevity.

Here is the normal form (takes about 20s):

SELECT *
FROM tableA
WHERE (columna = 1 OR columnb = 2) AND
    atype = 35 AND
    aid IN (1, 2, 3)
ORDER BY modified_at DESC
LIMIT 25;

Here is the explain for this query: http://explain.depesz.com/s/2v8

The CTE form (about 120ms):

WITH raw AS (
    SELECT *
    FROM tableA
    WHERE (columna = 1 OR columnb = 2) AND
        atype = 35 AND
        aid IN (1, 2, 3)
)
SELECT *
FROM raw
ORDER BY modified_at DESC
LIMIT 25;

Here is the explain for the CTE: http://explain.depesz.com/s/uxy

Simply moving the ORDER BY to the outer part of the query reduces the cost by 99%.

I have two questions: 1) is there a way to construct the first query without using a CTE such that it is logically equivalent but more performant, and 2) what does this difference in performance say about how the planner is determining how to fetch the data?

Regarding the questions above, are there additional statistics or other planner hints that would help improve the performance of the first query?
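
For example (just to illustrate the kind of thing I mean by additional statistics; I haven't verified that this actually helps here):

ALTER TABLE tableA ALTER COLUMN modified_at SET STATISTICS 1000; -- raise the per-column statistics target
ANALYZE tableA;                                                  -- recollect stats so the planner sees them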


Edit: Taking away the limit also causes the query to use a heap scan as opposed to an index scan backwards. Without the LIMIT the query completes in 40ms.

After seeing the effect of the LIMIT I tried LIMIT 1, LIMIT 2, etc. The query completes in under 100ms with LIMIT 1 and takes 10s+ with any LIMIT > 1.
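
For reference, the plans for the different LIMIT values can be captured with something along these lines (EXPLAIN with the ANALYZE and BUFFERS options is available on 9.2):

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM tableA
WHERE (columna = 1 OR columnb = 2) AND
    atype = 35 AND
    aid IN (1, 2, 3)
ORDER BY modified_at DESC
LIMIT 1;  -- repeat with LIMIT 2, LIMIT 25, and no LIMIT to watch the plan flip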

After thinking about this some more, question 2 boils down to: why does the planner use a backwards index scan in one case and a bitmap heap scan + sort in another, logically equivalent case? And how can I "help" the planner use the efficient plan in both cases?


Update: I accepted Craig's answer because it was the most comprehensive and helpful. The way I ended up solving the problem was by using a query that was practically equivalent, though not logically equivalent. At the root of the issue was a backwards index scan of the index on modified_at. To inform the planner that this was not a good idea, I added a predicate of the form WHERE modified_at >= NOW() - INTERVAL '1 year'. This included enough data for the application but prevented the planner from going down the backwards index scan path.
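
In terms of the simplified query above, the final form looked roughly like this (the one-year window is simply the range that covered enough data for my application):

SELECT *
FROM tableA
WHERE (columna = 1 OR columnb = 2) AND
    atype = 35 AND
    aid IN (1, 2, 3) AND
    modified_at >= NOW() - INTERVAL '1 year'  -- steers the planner away from the backwards index scan
ORDER BY modified_at DESC
LIMIT 25;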

This was a much lower impact solution that avoided the need to rewrite the queries using either a subquery or a CTE. YMMV.

asked Jul 10 '13 by Damon Snyder

People also ask

What is the difference between CTE and subquery?

A Common Table Expression (aka CTE, aka WITH statement) is a temporary data set to be used as part of a query. It only exists during the execution of that query; it cannot be used in other queries even within the same session (from Wikipedia). A subquery is a nested query; it's a query within a query (more Wikipedia).

What is CTE scan in PostgreSQL?

A "CTE scan" is a sequential scan of the materialized results of a CTE term (a named section like "blah" in a CTE like WITH blah AS (SELECT ...) . Materialized means that PostgreSQL has calculated the results and turned them into a temporary store of rows, it isn't just using the CTE like a view.


1 Answer

Here's why this is happening. The following explanation is current as of at least 9.3 (if you're reading this on a newer version, check to make sure it hasn't changed):

PostgreSQL doesn't optimize across CTE boundaries. Each CTE clause is run in isolation and its results are consumed by other parts of the query. So a query like:

WITH blah AS (
    SELECT * FROM some_table
)
SELECT *
FROM blah
WHERE id = 4;

will cause the full inner query to get executed. PostgreSQL won't "push down" the id = 4 qualification into the inner query. CTEs are "optimization fences" in that regard, which can be both good and bad: this lets you override the planner when you want to, but it prevents you from using a CTE as simple syntactic cleanup for a deeply nested FROM subquery chain when you do need push-down.

If you rephrase the above as:

SELECT *
FROM (SELECT * FROM some_table) AS blah
WHERE id = 4;

using a sub-query in FROM instead of a CTE, Pg will push the qual down into the subquery and it'll all run nice and quickly.

As you have discovered, this can also work to your benefit when the query planner makes a poor decision. It appears that in your case a backward index scan of the table is immensely more expensive than a bitmap or index scan of two smaller indexes followed by a filter and sort, but the planner doesn't think it will be, so it plans the query to scan the index.

When you use the CTE, it can't push the ORDER BY into the inner query, so you're overriding its plan and forcing it to use what it thinks is an inferior execution plan - but one that turns out to be much better.

There's a nasty workaround that can be used for these situations called the OFFSET 0 hack, but you should only use it if you can't figure out a way to make the planner do the right thing - and if you have to use it, please boil this down to a self-contained test case and report it to the PostgreSQL mailing list as a possible query planner bug.
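
Purely as an illustration (not a recommendation), applied to the simplified query from the question the hack would look something like this; the OFFSET 0 in the inner query acts as an optimization fence much like the CTE does:

SELECT *
FROM (
    SELECT *
    FROM tableA
    WHERE (columna = 1 OR columnb = 2) AND
        atype = 35 AND
        aid IN (1, 2, 3)
    OFFSET 0  -- stops the planner from flattening the subquery
) AS raw
ORDER BY modified_at DESC
LIMIT 25;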

Instead, I recommend first looking at why the planner is making the wrong decision.

The first candidate is stats / estimates problems, and sure enough, when we look at your problematic query plan there's a factor-of-3500 mis-estimation of the expected result rows. That's big, but not impossibly big. What's more interesting is that you actually get only one row where the planner is expecting a non-trivial row set. That doesn't help us much, though: if the row count is lower than expected, then choosing to use the index was a better choice than expected.

The main issue looks like it's not using the smaller, more selective indexes sierra_kilo and papa_lima because it sees the ORDER BY and thinks that it'll save more time doing a backward index scan and avoiding the sort than it really does. That makes sense given that there's only one matching row to sort! If it got the expected 3500 rows then it might've made more sense to avoid the sort, though that's still a fairly small rowset to just sort in memory.

Do you set any parameters like enable_seqscan, etc.? If you do, unset them; they're for testing only and totally inappropriate for production use. If you aren't using the enable_ params, I think it's worth raising this on the PostgreSQL performance mailing list, pgsql-performance. The anonymized plans make this a bit difficult, though, especially since there's no guarantee that identifiers from one plan refer to the same objects in the other plan, and they don't match what you wrote in the query in the question. You'll want to produce a properly hand-done version where everything matches up before asking on the mailing list.
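
If you're not sure whether any of those planner toggles have been changed from their defaults, something along these lines will show you, and RESET restores a single parameter for the session:

SELECT name, setting
FROM pg_settings
WHERE name LIKE 'enable%' AND setting = 'off';  -- any rows here mean a planner method has been disabled

RESET enable_seqscan;  -- put one of them back to its default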

There's a fairly good chance that you'll need to provide the real values for anyone to help. If you don't want to do that on a public mailing list, there are commercial PostgreSQL support providers who can look at it privately (I should note that I work for one of them, per my profile).

answered by Craig Ringer