Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pattern matching on jsonb key/value

I am using PostgreSQL 9.4. My table has a jsonb column:

CREATE TABLE "PreStage".transaction (
  transaction_id serial NOT NULL,
  transaction jsonb
  CONSTRAINT pk_transaction PRIMARY KEY (transaction_id)
);

CREATE INDEX idxgin ON "PreStage".transaction USING gin (transaction);

I store transactions in terms of key / value in the JSONB column. One of the requirements is to search customer name from the key value, hence I am running a query like:

SELECT transaction as data FROM "PreStage".transaction
WHERE  transaction->>('HCP_FST_NM') ilike ('%neer%');

What ever I do seems the query doesn't like the GIN index. How can I make the query use a GIN index with case insensitive pattern search?

I tried changing jsonb column to text, indexing it using gin_trgm_ops then search for required text, then converting the result to json and then searching in the required key/value. This approach doesn't seem to work.

like image 585
Neeraj Gupta Avatar asked Feb 08 '23 21:02

Neeraj Gupta


1 Answers

The default GIN index operator class jsonb_ops does not allow full-text pattern matching on a value. Details:

  • What's the proper index for querying structures in arrays in Postgres jsonb?

The best indexing strategy depends on your complete situation. There are many options. To just cover the one key you presented, you could use a functional trigram index. You already tested gin_trgm_ops, so you are already familiar with the additional module pg_trgm. For those who are not:

  • PostgreSQL LIKE query performance variations

Once the module is installed:

CREATE INDEX idxgin ON "PreStage".transaction
USING gin ((transaction->>'HCP_FST_NM') gin_trgm_ops);

Then this query is supported:

SELECT transaction AS data
FROM   "PreStage".transaction
WHERE  transaction->>'HCP_FST_NM' ILIKE '%neer%';

I also removed some unnecessary parentheses.

Depending on unknown details, there are various options to optimize index coverage.

For instance, if many rows don't have a key 'HCP_FST_NM' at all, make that a partial index to exclude irrelevant rows and keep the index small:

CREATE INDEX idxgin ON "PreStage".transaction
USING gin ((transaction->>'HCP_FST_NM') gin_trgm_ops)
WHERE transaction ? 'HCP_FST_NM';

? being the jsonb containment operator.
And add the same predicate to every query that's supposed to use this index:

SELECT transaction AS data
FROM   "PreStage".transaction
WHERE  transaction->>'HCP_FST_NM' ILIKE '%neer%'
AND    transaction ? 'HCP_FST_NM';  -- even if that seems redundant.
like image 123
Erwin Brandstetter Avatar answered Feb 12 '23 09:02

Erwin Brandstetter