Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between ANY(ARRAY[..]) vs ANY(VALUES (), () ..) in PostgreSQL

I am trying to workout query optimisation on id. Not sure which one way should I use. Below is the query plan using explain and cost wise looks similar.

1. explain (analyze, buffers) SELECT * FROM table1 WHERE id = ANY (ARRAY['00e289b0-1ac8-451f-957f-e00bc289148e'::uuid,...]);

QUERY PLAN:
    Index Scan using table1_pkey on table1  (cost=0.42..641.44 rows=76 width=835) (actual time=0.258..2.603 rows=76 loops=1)
    Index Cond: (id = ANY ('{00e289b0-1ac8-451f-957f-e00bc289148e,...}'::uuid[]))
    Buffers: shared hit=231 read=73
    Planning Time: 0.487 ms
    Execution Time: 2.715 ms)


2. explain (analyze, buffers) SELECT * FROM table1 WHERE id = ANY (VALUES ('00e289b0-1ac8-451f-957f-e00bc289148e'::uuid),...);

QUERY PLAN:
Nested Loop  (cost=1.56..644.10 rows=76 width=835) (actual time=0.058..0.297 rows=76 loops=1)
   Buffers: shared hit=304
   ->  HashAggregate  (cost=1.14..1.90 rows=76 width=16) (actual time=0.049..0.060 rows=76 loops=1)
         Group Key: "*VALUES*".column1
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.95 rows=76 width=16) (actual time=0.006..0.022 rows=76 loops=1)
   ->  Index Scan using table1_pkey on table1  (cost=0.42..8.44 rows=1 width=835) (actual time=0.002..0.003 rows=1 loops=76)
         Index Cond: (id = "*VALUES*".column1)
         Buffers: shared hit=304
Planning Time: 0.437 ms
Execution Time: 0.389 ms

Looks like VALUES () does some hashing and join to improve performance but not sure.

NOTE: In my practical use case, id is uuid_generate_v4() e.x. d31cddc0-1771-4de8-ad41-e6c568b39a5d but the column may not be indexed as such. Also, I have a table of with 5-10 million records. Which way is for the better query performance?

like image 218
impossible Avatar asked Apr 16 '26 11:04

impossible


1 Answers

When working with a large number of elements in an Array, PostgreSQL uses a bitmap heap scan. To leverage the primary key index instead, provide the values directly rather than using an Array.

For more information read this article:https://www.datadoghq.com/blog/100x-faster-postgres-performance-by-changing-1-line/

like image 98
Advaith s Avatar answered Apr 18 '26 05:04

Advaith s