Have been working with PostgreSQL 12.8 (Ubuntu 12.8-1.pgdg18.04+1) for a while. We have a significant time difference between normal queries and the same queries wrapped inside PLPGSQL functions with the same inputs. The table schema and queries are described as follow:
CREATE TABLE public."AircraftData"
(
"uid" BIGINT NOT NULL,
"time" BIGINT NOT NULL,
"action" SMALLINT NOT NULL,
"position" GEOMETRY NOT NULL
)
1 secs 279 msec: SELECT
A.uid AS uid,
A.max_time AS time,
ST_AsText
(
ST_Collect
(
ARRAY
(
SELECT
"position"
FROM
public."AircraftData"
WHERE
"uid" = A.uid
AND
"time" BETWEEN 1617608651642 AND 1617657618453
ORDER BY
"time" DESC
LIMIT 254
)
)
) AS trajectory
FROM
(
SELECT
M.uid AS uid,
M.max_time AS max_time
FROM
(
SELECT
"uid",
MAX("time") AS max_time
FROM
public."AircraftData"
WHERE
"time" BETWEEN 1617608651642 AND 1617657618453
GROUP BY
"uid"
) M
INNER JOIN
(
SELECT
"uid"
FROM
public."AircraftData"
WHERE
"time" > 1617657618453
GROUP BY
"uid"
) N
ON M.uid = N.uid
) A
INNER JOIN
public."AircraftData" B
ON
A.uid = B.uid AND A.max_time = B.time AND B.action != 2
ORDER BY
"time" ASC
4 secs 90 msec: DROP FUNCTION IF EXISTS foo(BIGINT,BIGINT);
CREATE OR REPLACE FUNCTION foo(last_server_time BIGINT, start_time BIGINT)
RETURNS TABLE
(
"uid" BIGINT,
"time" BIGINT,
"trajectory" TEXT
)
AS $BODY$
BEGIN
RETURN QUERY
SELECT
A.uid AS uid,
A.max_time AS time,
ST_AsText
(
ST_Collect
(
ARRAY
(
SELECT
"position"
FROM
public."AircraftData"
WHERE
"uid" = A.uid
AND
"time" BETWEEN $1 AND $2
ORDER BY
"time" DESC
LIMIT 254
)
)
) AS trajectory
FROM
(
SELECT
M.uid AS uid,
M.max_time AS max_time
FROM
(
SELECT
"uid",
MAX("time") AS max_time
FROM
public."AircraftData"
WHERE
"time" BETWEEN $1 AND $2
GROUP BY
"uid"
) M
INNER JOIN
(
SELECT
"uid"
FROM
public."AircraftData"
WHERE
"time" > $2
GROUP BY
"uid"
) N
ON M.uid = N.uid
) A
INNER JOIN
public."AircraftData" B
ON
A.uid = B.uid AND A.max_time = B.time AND B.action != 2
ORDER BY
"time" ASC;
END;
$BODY$ LANGUAGE PLPGSQL;
4 secs 97 msec: DROP FUNCTION IF EXISTS foo(BIGINT,BIGINT);
CREATE OR REPLACE FUNCTION foo(last_server_time BIGINT, start_time BIGINT)
RETURNS TABLE
(
"uid" BIGINT,
"time" BIGINT,
"trajectory" TEXT
)
AS $BODY$
BEGIN
RETURN QUERY
EXECUTE
'SELECT
A.uid AS uid,
A.max_time AS time,
ST_AsText
(
ST_Collect
(
ARRAY
(
SELECT
"position"
FROM
public."AircraftData"
WHERE
"uid" = A.uid
AND
"time" BETWEEN $1 AND $2
ORDER BY
"time" DESC
LIMIT 254
)
)
) AS trajectory
FROM
(
SELECT
M.uid AS uid,
M.max_time AS max_time
FROM
(
SELECT
"uid",
MAX("time") AS max_time
FROM
public."AircraftData"
WHERE
"time" BETWEEN $1 AND $2
GROUP BY
"uid"
) M
INNER JOIN
(
SELECT
"uid"
FROM
public."AircraftData"
WHERE
"time" > $2
GROUP BY
"uid"
) N
ON M.uid = N.uid
) A
INNER JOIN
public."AircraftData" B
ON
A.uid = B.uid AND A.max_time = B.time AND B.action != 2
ORDER BY
"time" ASC'
USING last_server_time, start_time;
END;
$BODY$ LANGUAGE PLPGSQL;
2 min 28 secs: DROP FUNCTION IF EXISTS foo(BIGINT,BIGINT);
CREATE OR REPLACE FUNCTION foo(last_server_time BIGINT, start_time BIGINT)
RETURNS TABLE
(
"uid" BIGINT,
"time" BIGINT,
"trajectory" TEXT
)
AS $BODY$
SELECT
A.uid AS uid,
A.max_time AS time,
ST_AsText
(
ST_Collect
(
ARRAY
(
SELECT
"position"
FROM
public."AircraftData"
WHERE
"uid" = A.uid
AND
"time" BETWEEN $1 AND $2
ORDER BY
"time" DESC
LIMIT 254
)
)
) AS trajectory
FROM
(
SELECT
M.uid AS uid,
M.max_time AS max_time
FROM
(
SELECT
"uid",
MAX("time") AS max_time
FROM
public."AircraftData"
WHERE
"time" BETWEEN $1 AND $2
GROUP BY
"uid"
) M
INNER JOIN
(
SELECT
"uid"
FROM
public."AircraftData"
WHERE
"time" > $2
GROUP BY
"uid"
) N
ON M.uid = N.uid
) A
INNER JOIN
public."AircraftData" B
ON
A.uid = B.uid AND A.max_time = B.time AND B.action != 2
ORDER BY
"time" ASC;
$BODY$ LANGUAGE SQL;
So what's the reason for this function time overhead?
Is there any room for improvements?
Adding label STABLE to SQL function(4) change execution time to 4 secs 67 msec!
Sort (cost=719260.66..719260.69 rows=12 width=48) (actual time=1543.822..1543.930 rows=200 loops=1)
Sort Key: (max("AircraftData"."time"))
Sort Method: quicksort Memory: 1637kB
Buffers: shared hit=895329, temp read=398 written=401
-> Merge Join (cost=493439.89..719260.44 rows=12 width=48) (actual time=1267.917..1543.749 rows=200 loops=1)
Merge Cond: ("AircraftData".uid = "AircraftData_1".uid)
Buffers: shared hit=895329, temp read=398 written=401
-> Nested Loop (cost=91702.67..111690.17 rows=1 width=24) (actual time=272.476..298.257 rows=200 loops=1)
Buffers: shared hit=273686
-> Finalize GroupAggregate (cost=91702.23..92568.25 rows=2279 width=16) (actual time=272.460..274.158 rows=2100 loops=1)
Group Key: "AircraftData".uid
Buffers: shared hit=179393
-> Gather Merge (cost=91702.23..92511.27 rows=6837 width=16) (actual time=272.450..273.457 rows=6600 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=179393
-> Sort (cost=90702.19..90707.89 rows=2279 width=16) (actual time=249.231..249.334 rows=1650 loops=4)
Sort Key: "AircraftData".uid
Sort Method: quicksort Memory: 39kB
Worker 0: Sort Method: quicksort Memory: 195kB
Worker 1: Sort Method: quicksort Memory: 195kB
Worker 2: Sort Method: quicksort Memory: 195kB
Buffers: shared hit=179393
-> Partial HashAggregate (cost=90552.30..90575.09 rows=2279 width=16) (actual time=248.748..248.935 rows=1650 loops=4)
Group Key: "AircraftData".uid
Buffers: shared hit=179372
-> Parallel Index Only Scan using AircraftData_time_uid_index on "AircraftData" (cost=0.44..85716.66 rows=967127 width=16) (actual time=0.044..77.510 rows=760982 loops=4)
Index Cond: (("time" >= '1617608651642'::bigint) AND ("time" <= '1617657618453'::bigint))
Heap Fetches: 0
Buffers: shared hit=179372
-> Index Scan using AircraftData_time_index on "AircraftData" b (cost=0.44..8.37 rows=1 width=16) (actual time=0.011..0.011 rows=0 loops=2100)
Index Cond: ("time" = (max("AircraftData"."time")))
Filter: ((action <> 2) AND ("AircraftData".uid = uid))
Rows Removed by Filter: 50
Buffers: shared hit=94293
-> Group (cost=401737.23..402563.36 rows=2279 width=8) (actual time=954.077..988.991 rows=124598 loops=1)
Group Key: "AircraftData_1".uid
Buffers: shared hit=542041, temp read=398 written=401
-> Gather Merge (cost=401737.23..402546.27 rows=6837 width=8) (actual time=954.067..977.742 rows=178728 loops=1)
Workers Planned: 3
Workers Launched: 2
Buffers: shared hit=542041, temp read=398 written=401
-> Sort (cost=400737.19..400742.88 rows=2279 width=8) (actual time=944.755..949.248 rows=59650 loops=3)
Sort Key: "AircraftData_1".uid
Sort Method: external merge Disk: 1064kB
Worker 0: Sort Method: external merge Disk: 1056kB
Worker 1: Sort Method: external merge Disk: 1064kB
Buffers: shared hit=542041, temp read=398 written=401
-> Partial HashAggregate (cost=400587.30..400610.09 rows=2279 width=8) (actual time=922.174..930.673 rows=59687 loops=3)
Group Key: "AircraftData_1".uid
Buffers: shared hit=542027
-> Parallel Index Only Scan using AircraftData_time_uid_index on "AircraftData" "AircraftData_1" (cost=0.44..388576.19 rows=4804444 width=8) (actual time=0.023..451.211 rows=4960127 loops=3)
Index Cond: ("time" > '1617657618453'::bigint)
Heap Fetches: 0
Buffers: shared hit=542027
SubPlan 1
-> Limit (cost=0.44..17080.15 rows=254 width=48) (actual time=0.008..1.169 rows=254 loops=200)
Buffers: shared hit=79602
-> Index Scan Backward using AircraftData_time_uid_index on "AircraftData" "AircraftData_2" (cost=0.44..88492.18 rows=1316 width=48) (actual time=0.007..1.156 rows=254 loops=200)
Index Cond: (("time" >= '1617608651642'::bigint) AND ("time" <= '1617657618453'::bigint) AND (uid = "AircraftData".uid))
Buffers: shared hit=79602
Planning Time: 0.741 ms
JIT:
Functions: 64
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 7.142 ms, Inlining 163.028 ms, Optimization 205.236 ms, Emission 145.702 ms, Total 521.108 ms
Execution Time: 1549.206 ms
Sort (cost=871918.05..871918.08 rows=12 width=48) (actual time=5482.659..5482.683 rows=200 loops=1)
Sort Key: (max("AircraftData"."time"))
Sort Method: quicksort Memory: 1637kB
Buffers: shared hit=855647
-> Nested Loop (cost=647721.19..871917.84 rows=12 width=48) (actual time=3269.758..5482.342 rows=200 loops=1)
Join Filter: ("AircraftData".uid = "AircraftData_1".uid)
Rows Removed by Join Filter: 12419447
Buffers: shared hit=855647
-> Nested Loop (cost=121017.25..140161.52 rows=1 width=24) (actual time=734.056..774.883 rows=200 loops=1)
Buffers: shared hit=269116
-> HashAggregate (cost=121016.81..121039.60 rows=2279 width=16) (actual time=732.927..733.488 rows=2100 loops=1)
Group Key: "AircraftData".uid
Buffers: shared hit=174823
-> Index Only Scan using AircraftData_time_uid_index on "AircraftData" (cost=0.44..106026.34 rows=2998095 width=16) (actual time=0.047..279.640 rows=3043927 loops=1)
Index Cond: (("time" >= '1617608651642'::bigint) AND ("time" <= '1617657618453'::bigint))
Heap Fetches: 0
Buffers: shared hit=174823
-> Index Scan using AircraftData_time_index on "AircraftData" b (cost=0.44..8.37 rows=1 width=16) (actual time=0.019..0.019 rows=0 loops=2100)
Index Cond: ("time" = (max("AircraftData"."time")))
Filter: ((action <> 2) AND ("AircraftData".uid = uid))
Rows Removed by Filter: 50
Buffers: shared hit=94293
-> HashAggregate (cost=526703.94..526726.73 rows=2279 width=8) (actual time=12.619..20.103 rows=62098 loops=200)
Group Key: "AircraftData_1".uid
Buffers: shared hit=506929
-> Index Only Scan using AircraftData_time_uid_index on "AircraftData" "AircraftData_1" (cost=0.44..489469.50 rows=14893775 width=8) (actual time=0.010..1258.534 rows=14880381 loops=1)
Index Cond: ("time" > '1617657618453'::bigint)
Heap Fetches: 0
Buffers: shared hit=506929
SubPlan 1
-> Limit (cost=0.44..17080.15 rows=254 width=48) (actual time=0.014..1.280 rows=254 loops=200)
Buffers: shared hit=79602
-> Index Scan Backward using AircraftData_time_uid_index on "AircraftData" "AircraftData_2" (cost=0.44..88492.18 rows=1316 width=48) (actual time=0.013..1.265 rows=254 loops=200)
Index Cond: (("time" >= '1617608651642'::bigint) AND ("time" <= '1617657618453'::bigint) AND (uid = "AircraftData".uid))
Buffers: shared hit=79602
Planning Time: 1.638 ms
JIT:
Functions: 28
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 7.741 ms, Inlining 23.622 ms, Optimization 98.256 ms, Emission 56.727 ms, Total 186.345 ms
Execution Time: 5491.240 ms
Hard to say, you didn't disclose much about your setup. Probably several reasons. If your table is big (like your execution times suggest), a prime suspect for the PL/pgSQL function would be RETURN QUERY which does not allow parallelism. This restriction is removed with the upcoming PostgreSQL 14. The release notes:
Allow plpgsql's
RETURN QUERYto execute its query using parallelism (Tom Lane)
The dismal performance of the SQL function is a mystery. Maybe a generic query plan, while your plain SQL gets a tailored query plan due to the fact that it's planned with constant filters instead of parameters.
This shouldn't normally happen. But it can. See:
(Typically, there are other problems behind it ...)
Yes.
SELECT uid, time
, ST_AsText(ST_Collect(ARRAY(
SELECT "position"
FROM "AircraftData"
WHERE uid = A.uid
AND time BETWEEN 1617608651642 AND 1617657618453
ORDER BY time DESC
LIMIT 254))) AS trajectory
FROM (
SELECT DISTINCT ON (uid)
uid, time, action
FROM "AircraftData"
WHERE time BETWEEN 1617608651642 AND 1617657618453
ORDER BY uid, time DESC, action = 2 -- get action <> 2 if any
) A
WHERE action <> 2
AND EXISTS (
SELECT FROM "AircraftData"
WHERE time > 1617657618453
AND uid = A.uid
);
Equivalent, except for one difference: this query only ever returns a single trajectory per uid, where your original would return multiple if there is more than one row with the latest time and action <> 2. I assume, that's how it should be anyway (or there can never be more than one action <> 2 per design).
About DISTINCT ON:
Sorting by action = 2 works because false sorts before true (and null sorts last) in ascending order. So any peer with action <> 2 sorts first and is picked by DISTINCT ON - and that qualifies an uid in your query. See:
Also, to filter uid which also have a later entry, an EXISTS subquery is typically much cheaper than aggregating all rows after some timestamp etc.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With