
Function executes faster without STRICT modifier?

While answering another question, I stumbled upon a drop in performance when a simple SQL function is declared STRICT.

For demonstration, I created two variants of a function ordering two elements of an array in ascending order.

Test setup

Table with 10000 random pairs of integers:

CREATE TABLE tbl (arr int[]);

INSERT INTO tbl 
SELECT ARRAY[(random() * 1000)::int, (random() * 1000)::int]
FROM   generate_series(1,10000);

Function without STRICT modifier:

CREATE OR REPLACE FUNCTION f_sort_array(int[])
  RETURNS int[]
  LANGUAGE sql IMMUTABLE AS
$func$
SELECT CASE WHEN $1[1] > $1[2] THEN ARRAY[$1[2], $1[1]] ELSE $1 END;
$func$;

Function with STRICT modifier (otherwise identical):

CREATE OR REPLACE FUNCTION f_sort_array_strict(int[])
  RETURNS int[]
  LANGUAGE sql IMMUTABLE STRICT AS
$func$
SELECT CASE WHEN $1[1] > $1[2] THEN ARRAY[$1[2], $1[1]] ELSE $1 END;
$func$;

Results

I executed each around 20 times and took the best result from EXPLAIN ANALYZE.

SELECT f_sort_array(arr)        FROM tbl;  -- Total runtime:  43 ms
SELECT f_sort_array_strict(arr) FROM tbl;  -- Total runtime: 103 ms

These are the results from Postgres 9.0.5 on Debian Squeeze. Similar results on 8.4.

In a test with all NULL values both functions perform the same: ~37 ms.
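
For reference, such an all-NULL test could be set up roughly like this. This is a sketch, not the original test script: the table name tbl_null is made up here, and it assumes that "all NULL" means the whole array is NULL rather than an array of NULL elements.

```sql
-- Hypothetical table tbl_null: same shape as tbl, but every array is NULL.
CREATE TABLE tbl_null (arr int[]);

INSERT INTO tbl_null
SELECT NULL::int[]
FROM   generate_series(1,10000);

-- Compare both variants on NULL-only input.
EXPLAIN ANALYZE SELECT f_sort_array(arr)        FROM tbl_null;
EXPLAIN ANALYZE SELECT f_sort_array_strict(arr) FROM tbl_null;
```

With NULL input, a STRICT function is not executed at all; Postgres returns NULL right away. That would fit the identical timings.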

I did some research and found an interesting gotcha. Declaring an SQL function STRICT disables function-inlining in most cases. More about that in the PostgreSQL Online Journal or in the pgsql-performance mailing list or in the Postgres Wiki.

But I am not quite sure how this could be the explanation. Can failing to inline the function really cause such a performance drop in this simple scenario? There is no index, no disk read, no sorting. Maybe it is overhead from the repeated function call that inlining streamlines away?
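
One way to check whether a call actually gets inlined is to look at the Output line of EXPLAIN (VERBOSE). A minimal sketch, assuming the table and the two functions from the test setup above (on 8.4 the syntax would be EXPLAIN VERBOSE without parentheses):

```sql
-- If the function is inlined, the Output line should show the CASE expression itself.
EXPLAIN (VERBOSE) SELECT f_sort_array(arr) FROM tbl;

-- If it is not inlined, the Output line should show a plain call to f_sort_array_strict().
EXPLAIN (VERBOSE) SELECT f_sort_array_strict(arr) FROM tbl;
```

If the first plan shows the expanded expression and the second shows a function call, only the first was inlined.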

Retests

Same test, same hardware, Postgres 9.1. Even bigger differences:

SELECT f_sort_array(arr)        FROM tbl;  -- Total runtime:  27 ms
SELECT f_sort_array_strict(arr) FROM tbl;  -- Total runtime: 107 ms

Same test, new hardware, Postgres 9.6. The gap is bigger still:

SELECT f_sort_array(arr)        FROM tbl;  -- Total runtime:  10 ms
SELECT f_sort_array_strict(arr) FROM tbl;  -- Total runtime:  60 ms

Erwin Brandstetter asked Dec 10 '11 07:12


2 Answers

"Maybe an overhead from the repeated function call that is streamlined away by inlining the function?"

That's what I'd guess. You've got a very simple expression there. An actual function call presumably involves stack setup, passing parameters, etc.

The test below gives run-times of 5ms for inlined and 50ms for strict.

BEGIN;

CREATE SCHEMA f;

SET search_path = f;

CREATE FUNCTION f1(int) RETURNS int AS $$SELECT 1$$ LANGUAGE SQL;
CREATE FUNCTION f2(int) RETURNS int AS $$SELECT 1$$ LANGUAGE SQL STRICT;

\timing on
SELECT sum(f1(i)) FROM generate_series(1,10000) i;
SELECT sum(f2(i)) FROM generate_series(1,10000) i;
\timing off

ROLLBACK;

Richard Huxton answered Oct 24 '22 10:10


It is about function inlining, as suspected and as confirmed by Richard's test.

To be clear, the Postgres Wiki lists this requirement for inlining of a scalar function (like my example):

  • if the function is declared STRICT, then the planner must be able to prove that the body expression necessarily returns NULL if any parameter is null. At present, this condition is only satisfied if: every parameter is referenced at least once, and all functions, operators and other constructs used in the body are themselves STRICT.

The example function obviously does not qualify. Both the CASE construct and the ARRAY constructor are to blame according to my tests.
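
For contrast, here is a sketch of a STRICT function that should satisfy the quoted rule: every parameter is referenced, and the only construct in the body is the integer + operator, which is itself STRICT. The function f_add_strict is hypothetical and not part of the original tests; tbl is the table from the question.

```sql
-- Hypothetical example: both parameters are used, and integer "+" is STRICT.
CREATE OR REPLACE FUNCTION f_add_strict(int, int)
  RETURNS int
  LANGUAGE sql IMMUTABLE STRICT AS
$func$
SELECT $1 + $2;
$func$;

-- If the function is inlined, the Output line should show the addition
-- instead of a call to f_add_strict().
EXPLAIN (VERBOSE) SELECT f_add_strict(arr[1], arr[2]) FROM tbl;
```

I have not timed this variant, so treat it as an illustration of the rule rather than a benchmark.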

Table functions (returning a set of rows) are pickier still:

  • the function is not declared STRICT

If the function cannot be inlined, every single call incurs the function-call overhead again. The difference in performance got bigger in later Postgres versions.

Retest with PostgreSQL 13 on a current laptop. Bigger difference still:

SELECT f_sort_array(arr)        FROM tbl;  -- Total runtime:   4 ms
SELECT f_sort_array_strict(arr) FROM tbl;  -- Total runtime:  32 ms

Same test on dbfiddle.com, PostgreSQL 13. Bigger difference still:

SELECT f_sort_array(arr)        FROM tbl;  -- Total runtime:   4 ms
SELECT f_sort_array_strict(arr) FROM tbl;  -- Total runtime: 137 ms (!)

Comprehensive test including tests with half and all NULL values:

db<>fiddle here

Erwin Brandstetter answered Oct 24 '22 11:10