Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Query optimization when using a JSON field

Tags:

sql

postgresql

Running a PostgreSQL 9.6.4 on my laptop, I have a table called node which has a primary key id field and a properties::jsonb field.

I have a GIN index setup on the properties field.

When I run this query:

SELECT   n.*
FROM     node n
WHERE    node_type_id = '2'
AND      properties @> '{"slug":"wild-castles"}'::JSONB
ORDER BY n.id ASC OFFSET 0 LIMIT 10;

on ~5M rows table it takes about 20 seconds to get an answer. Looking into the explain plan I found out the query optimizer is first sorting the table by the primary key and then filtering by the properties field:

Limit  (cost=0.56..1517.94 rows=10 width=154)
  ->  Index Scan using node_pkey on node n  (cost=0.56..739571.11 rows=4874 width=154)
        Filter: ((properties @> '{"slug": "wild-castles"}'::jsonb) AND ((node_type_id)::text = '2'::text))

But when I remove the ordering I'm seeing the optimizer using the index as expected:

SELECT n.*
FROM   node n
WHERE  node_type_id = '2'
AND    properties @> '{"slug":"wild-castles"}'::JSONB
OFFSET 0 LIMIT 10;

Limit  (cost=93.77..127.10 rows=10 width=154)
  ->  Bitmap Heap Scan on node n  (cost=93.77..16338.56 rows=4874 width=154)
        Recheck Cond: (properties @> '{"slug": "wild-castles"}'::jsonb)
        Filter: ((node_type_id)::text = '2'::text)
        ->  Bitmap Index Scan on node_ix02  (cost=0.00..92.55 rows=4874 width=0)
              Index Cond: (properties @> '{"slug": "wild-castles"}'::jsonb)

Also, a simple WHERE properties @> '{"slug":"wild-castles"}'::JSONB behaves as expected:

EXPLAIN SELECT   n.*
FROM     node n
WHERE    properties @> '{"slug":"wild-castles"}'::JSONB
;

Bitmap Heap Scan on node n  (cost=93.77..16326.38 rows=4874 width=154)
  Recheck Cond: (properties @> '{"slug": "wild-castles"}'::jsonb)
  ->  Bitmap Index Scan on node_ix02  (cost=0.00..92.55 rows=4874 width=0)
        Index Cond: (properties @> '{"slug": "wild-castles"}'::jsonb)

So I guess I'm wondering why would the optimizer not use the index to filter out the rows first and then order them by the id field?

like image 565
acohen Avatar asked Sep 10 '17 21:09

acohen


2 Answers

Change Planner Method Configuration and force planer not to do seqscan

eg

      SET enable_seqscan = OFF;

       SELECT   n.*
        FROM     node n
               WHERE    node_type_id = '2'
               AND      properties @> '{"slug":"wild-castles"}'::JSONB
             ORDER BY n.id ASC OFFSET 0 LIMIT 10;
like image 163
sandeep rawat Avatar answered Oct 14 '22 16:10

sandeep rawat


In my experience, you sometimes have to trick the query planner to get it to perform well, and it takes some tweaking and fiddling with...

I would try running this to see how it performs:

SELECT nn.* FROM (
    SELECT n.*
    FROM   node n
    WHERE  node_type_id = '2'
    AND    properties @> '{"slug":"wild-castles"}'::JSONB
) nn
ORDER BY nn.id ASC OFFSET 0 LIMIT 10;
like image 32
Tyler Avatar answered Oct 14 '22 14:10

Tyler