Why is PostgreSQL calling my STABLE/IMMUTABLE function multiple times?

Tags:

I'm trying to optimise a complex query in PostgreSQL 9.1.2, which calls some functions. These functions are marked STABLE or IMMUTABLE and are called several times with the same arguments in the query. I assumed PostgreSQL would be smart enough to only call them once for each set of inputs - after all, that's the point of STABLE and IMMUTABLE, isn't it? But it appears that the functions are being called multiple times. I wrote a simple function to test this, which confirms it:

CREATE OR REPLACE FUNCTION test_multi_calls1(one integer)
RETURNS integer
AS $BODY$
BEGIN
    RAISE NOTICE 'Called with %', one;
    RETURN one;
END;
$BODY$ LANGUAGE plpgsql IMMUTABLE;


WITH data AS
(
    SELECT 10 AS num
    UNION ALL SELECT 10
    UNION ALL SELECT 20
)
SELECT test_multi_calls1(num)
FROM data;

Output:

NOTICE:  Called with 10
NOTICE:  Called with 10
NOTICE:  Called with 20

Why is this happening and how can I get it to only execute the function once?

854

asked Dec 16 '11 04:12

EMP

1 Answers

The following extension of your test code is informative:

CREATE OR REPLACE FUNCTION test_multi_calls1(one integer)
RETURNS integer
AS $BODY$
BEGIN
    RAISE NOTICE 'Immutable called with %', one;
    RETURN one;
END;
$BODY$ LANGUAGE plpgsql IMMUTABLE;
CREATE OR REPLACE FUNCTION test_multi_calls2(one integer)
RETURNS integer
AS $BODY$
BEGIN
    RAISE NOTICE 'Volatile called with %', one;
    RETURN one;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;

WITH data AS
(
    SELECT 10 AS num
    UNION ALL SELECT 10
    UNION ALL SELECT 20
)
SELECT test_multi_calls1(num)
FROM data
where test_multi_calls2(40) = 40
and test_multi_calls1(30) = 30

OUTPUT:

NOTICE:  Immutable called with 30
NOTICE:  Volatile called with 40
NOTICE:  Immutable called with 10
NOTICE:  Volatile called with 40
NOTICE:  Immutable called with 10
NOTICE:  Volatile called with 40
NOTICE:  Immutable called with 20

Here we can see that while in the select-list the immutable function was called multiple times, in the where clause it was called once, while the volatile was called thrice.

The important thing isn't that PostgreSQL will only call a STABLE or IMMUTABLE function once with the same data - your example clearly shows that this is not the case - it's that it may call it only once. Or perhaps it will call it twice when it would have to call a volatile version 50 times, and so on.

There are different ways in which stability and immutability can be taken advantage of, with different costs and benefits. To provide the sort of saving you are suggesting it should make with select-lists it would have to cache the results, and then lookup each argument (or list of arguments) in this cache before either returning the cached result or calling function on a cache-miss. This would be more expensive than calling your function, even in the case where there was a high percentage of cache-hits (there could be 0% cache hits meaning this "optimisation" did extra work for absolutely no gain). It could store maybe just the last parameter and result, but again that could be completely useless.

This is especially so considering that stable and immutable functions are often the lightest functions.

With the where clause however, the immutability of test_multi_calls1 allows PostgreSQL to actually restructure the query from the plain meaning of the SQL given:

For every row calculate test_multi_calls1(30) and if the result is equal to 30 continue processing the row in question

To a different query plan entirely:

Calculate test_multi_calls1(30) and if it is equal to 30 then continue with the query otherwise return a zero row result-set without any further calculation

This is the sort of use that PostgreSQL makes of STABLE and IMMUTABLE - not the caching of results, but the rewriting of queries into different queries which are more efficient but give the same results.

Note also that test_multi_calls1(30) is called before test_multi_calls2(40) no matter what order they appear in the where clause. This means that if the first call results in no rows being returned (replace = 30 with = 31 to test) then the volatile function won't be called at all - again regardless to which is on which side of the and.

This particular sort of rewriting depends upon immutability or stability. With where test_multi_calls1(30) != num query re-writing will happen for immutable but not for merely stable functions. With where test_multi_calls1(num) != 30 it won't happen at all (multiple calls) though there are other optimisations possible:

Expressions containing only STABLE and IMMUTABLE functions can be used with index scans. Expressions containing VOLATILE functions cannot. The number of calls may or may not decrease, but much more importantly the results of the calls will then be used in a much more efficient way in the rest of the query (only really matters on large tables, but then it can make a massive difference).

In all, don't think of volatility categories in terms of memoisation, but rather in terms of giving PostgreSQL's query planner opportunities to restructure entire queries in ways that are logically equivalent (same results) but much more efficient.

answered Oct 07 '22 05:10

Jon Hanna

Related questions
                            
                                Efficient iteration over slice in Python
                            
                                Eager load associations with Active Model Serializers
                            
                                Haskell: Why does Int perform worse than Word64, and why is my program far slower than C?
                            
                                Size of store buffers on Intel hardware? What exactly is a store buffer?
                            
                                Performance Implications of Comments in SQL Stored Procedures
                            
                                For VS Foreach on Array performance (in AS3/Flex)
                            
                                Loading only part of a bitmap file in Android
                            
                                XNA game performance
                            
                                Why are the functions making Vector an instance of Functor, Monad, Applicative, Alternative, Foldable and Traversable slow?
                            
                                Is Boost/STL slow with regard to high performance computing? [closed]
                            
                                C# RichEditBox has extremely slow performance (4 minutes loading)
                            
                                SQL Efficiency: WHERE IN Subquery vs. JOIN then GROUP
                            
                                What are implications of SET-ting ALLOW_SNAPSHOT_ISOLATION ON?
                            
                                How to occupy 80% CPU consistently?
                            
                                Fast in-place replacement of some values in a numpy array
                            
                                matplotlib very slow. Is it normal?
                            
                                What does replace do if no match is found? (under the hood)
                            
                                Redis vs Kafka vs RabbitMQ for 1MB messages
                            
                                SQL Server - Synonyms Tips & Tricks? [closed]
                            
                                Can someone please explain what these ApacheBench results mean?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why is PostgreSQL calling my STABLE/IMMUTABLE function multiple times?

Tags:

performance

function

postgresql

immutability

EMP

People also ask

1 Answers

Jon Hanna

Recent Activity

Donate For Us