Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does Postgresql plpgsql/sql support short circuiting in the where clause?

If I have the following toy query

SELECT *
FROM my_tables
WHERE my_id in (
    SELECT my_other_id
    FROM my_other_tables
) AND some_slow_func(arg) BETWEEN 1 AND 2;

Would the first condition in the WHERE clause short circuit the second condition which would have a complex run time?

I'm working on some sql that is actually part of a FOR LOOP in plpgsql, and I could do iterations over all records that exist in the my_other_tables, and then test within the scope of the FOR LOOP with the some_slow_func(). But I'm curious if sql supports, or plpgsql supports short circuiting.

Some Research: I looked in the Postgres mailing lists and found this saying SQL in general doesn't support short circuiting:

http://www.postgresql.org/message-id/[email protected]

But one of the responses says that order can be enforced through subselects. I'm not exactly sure what he's speaking about. I know what a subselect is, but I'm not sure how order would be enforced? Could some one clarify this for me?

like image 664
Tim Sanders Avatar asked Feb 18 '13 19:02

Tim Sanders


1 Answers

As documented, the evaluation order in a WHERE clause is supposed to be unpredictable.

It's different with subqueries. With PostgreSQL older than version 12, the simplest and common technique to drive the evaluation order is to write a subquery in a CTE. To make sure that the IN(...) is evaluated first, your code could be written as:

WITH subquery AS
(select * from my_tables
  WHERE my_id in (SELECT my_other_id FROM my_other_tables)
)
SELECT * FROM subquery
  WHERE some_slow_func(arg) BETWEEN 1 AND 2;

Starting with PostgreSQL version 12, WITH subqueries may be inlined by the optimizer (see the doc page on WITH queries for all the details), and the non-inlining is only guaranteed when adding the MATERIALIZED clause:

WITH subquery AS MATERIALIZED
(select * ... the rest is similar as above)

Something else that you may tweak is the cost of your function to signal to the optimizer that it's slow. The default cost for a function is 100, and it can be altered with a statement like:

ALTER FUNCTION funcname(argument types) cost N;

where N is the estimated per-call cost, expressed in an arbitrary unit that should be compared to the Planner Cost Constants.

like image 114
Daniel Vérité Avatar answered Oct 03 '22 08:10

Daniel Vérité