The Postgres docs say:
For best optimization results, you should label your functions with the strictest volatility category that is valid for them.
However, I seem to have an example where this is not the case, and I'd like to understand what's going on. (Background: I'm running Postgres 9.2.)
I often need to convert times expressed as integer numbers of seconds to dates. I've written a function to do this:
CREATE OR REPLACE FUNCTION
to_datestamp(time_int double precision) RETURNS date AS $$
SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL;
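For example (a quick check, not part of the original question; 1262304000 is 2010-01-01 00:00:00 UTC, and the result shown assumes the session time zone is UTC):
SELECT to_datestamp(1262304000);  -- 2010-01-01 (with TimeZone = 'UTC')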
Let's compare performance to otherwise identical functions, with volatility set to IMMUTABLE and to STABLE:
CREATE OR REPLACE FUNCTION
to_datestamp_immutable(time_int double precision) RETURNS date AS $$
SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL IMMUTABLE;
CREATE OR REPLACE FUNCTION
to_datestamp_stable(time_int double precision) RETURNS date AS $$
SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL STABLE;
To test this, I'll create a table of 10^6 random integers corresponding to times between 2010-01-01 and 2015-01-01
CREATE TEMPORARY TABLE random_times AS
SELECT 1262304000 + round(random() * 157766400) AS time_int
FROM generate_series(1, 1000000) x;
Finally, I'll time calling the three functions on this table; on my particular box, the original takes ~6 seconds, the immutable version takes ~33 seconds, and the stable version takes ~6 seconds.
EXPLAIN ANALYZE SELECT to_datestamp(time_int) FROM random_times;
Seq Scan on random_times (cost=0.00..20996.62 rows=946950 width=8)
(actual time=0.150..5493.722 rows=1000000 loops=1)
Total runtime: 6258.827 ms
EXPLAIN ANALYZE SELECT to_datestamp_immutable(time_int) FROM random_times;
Seq Scan on random_times (cost=0.00..250632.00 rows=946950 width=8)
(actual time=0.211..32209.964 rows=1000000 loops=1)
Total runtime: 33060.918 ms
EXPLAIN ANALYZE SELECT to_datestamp_stable(time_int) FROM random_times;
Seq Scan on random_times (cost=0.00..20996.62 rows=946950 width=8)
(actual time=0.086..5295.608 rows=1000000 loops=1)
Total runtime: 6063.498 ms
What's going on here? For example, is Postgres spending time caching results even though that won't help, since the arguments passed to the function are unlikely to repeat?
Thanks!
UPDATE
Thanks to Craig Ringer, this has been discussed on the pgsql-performance mailing list. Highlights:
Tom Lane says:
[ shrug... ] Using IMMUTABLE to lie about the mutability of a function (in this case, date_trunc) is a bad idea. It's likely to lead to wrong answers, never mind performance issues. In this particular case, I imagine the performance problem comes from having suppressed the option to inline the function body ... but you should be more worried about whether you aren't getting flat-out bogus answers in other cases.
Pavel Stehule says:
If I understand, a used IMMUTABLE flag disables inlining. What you see, is SQL eval overflow. My rule is - don't use flags in SQL functions, when it is possible.
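One way to see what both comments are getting at (this check is mine, not from the mailing list thread) is to look at the volatility that the built-in functions used in the body are actually declared with; date_trunc on timestamp with time zone is only STABLE, because its result depends on the TimeZone setting:
-- 'i' = immutable, 's' = stable, 'v' = volatile
SELECT oid::regprocedure AS signature, provolatile
FROM pg_proc
WHERE proname IN ('to_timestamp', 'date_trunc');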
ANSWER
The problem is that to_timestamp() returns timestamp with time zone, so the result of the expression depends on the session's TimeZone setting and the function body is really only STABLE, not IMMUTABLE.
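As a quick illustration (mine, not part of the original answer), the same epoch value truncates to different dates depending on the session's TimeZone setting, which is exactly why the wrapper cannot honestly be IMMUTABLE:
-- 1262304000 is 2010-01-01 00:00:00 UTC
set timezone = 'UTC';
select date_trunc('day', to_timestamp(1262304000))::date;  -- 2010-01-01
set timezone = 'America/Los_Angeles';
select date_trunc('day', to_timestamp(1262304000))::date;  -- 2009-12-31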
If the to_timestamp() call is replaced with a "manual" calculation that does not involve time zones, there is no difference in performance:
create or replace function to_datestamp_stable(
time_int double precision
) returns date as $$
select date_trunc('day', timestamp 'epoch' + $1 * interval '1 second')::date;
$$ language sql stable;
explain analyze
select to_datestamp_stable(a)
from generate_series(1, 1000000) s (a);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Function Scan on generate_series s (cost=0.00..22.50 rows=1000 width=4) (actual time=96.962..433.562 rows=1000000 loops=1)
Total runtime: 459.531 ms
create or replace function to_datestamp_immutable(
time_int double precision
) returns date as $$
select date_trunc('day', timestamp 'epoch' + $1 * interval '1 second')::date;
$$ language sql immutable;
explain analyze
select to_datestamp_immutable(a)
from generate_series(1, 1000000) s (a);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Function Scan on generate_series s (cost=0.00..22.50 rows=1000 width=4) (actual time=94.188..433.492 rows=1000000 loops=1)
Total runtime: 459.434 ms
The same functions using to_timestamp:
create or replace function to_datestamp_stable(
time_int double precision
) returns date as $$
select date_trunc('day', to_timestamp($1))::date;
$$ language sql stable;
explain analyze
select to_datestamp_stable(a)
from generate_series(1, 1000000) s (a);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Function Scan on generate_series s (cost=0.00..20.00 rows=1000 width=4) (actual time=91.924..3059.570 rows=1000000 loops=1)
Total runtime: 3103.655 ms
create or replace function to_datestamp_immutable(
time_int double precision
) returns date as $$
select date_trunc('day', to_timestamp($1))::date;
$$ language sql immutable;
explain analyze
select to_datestamp_immutable(a)
from generate_series(1, 1000000) s (a);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Function Scan on generate_series s (cost=0.00..262.50 rows=1000 width=4) (actual time=92.639..20083.920 rows=1000000 loops=1)
Total runtime: 20149.311 ms
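A way to confirm that the IMMUTABLE label is what blocks inlining (again, a check that is not part of the original answer) is EXPLAIN VERBOSE: when a LANGUAGE SQL function has been inlined, the plan's Output line shows the expanded expression instead of the function call.
-- If inlined (expected for the stable version here), the Output line should
-- show the expanded date_trunc(...)::date expression.
explain verbose
select to_datestamp_stable(a)
from generate_series(1, 1000000) s (a);
-- If not inlined (expected for the immutable version here), the Output line
-- should show the to_datestamp_immutable(...) call itself.
explain verbose
select to_datestamp_immutable(a)
from generate_series(1, 1000000) s (a);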