Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL Index on JSON

Using Postgres 9.4, I want to create an index on a json column that will be used when searching on specific keys within the column.

For example I have an 'farm' table with a json column 'animals'.

The animals column has json objects of the general format:

'{"cow": 2, "chicken": 11, "horse": 3}' 

I have tried a number of indexes (separately):

  1. create INDEX animal_index ON farm ((animal ->> 'cow'));
  2. create INDEX animal_index ON farm using gin ((animal ->> 'cow'));
  3. create INDEX animal_index ON farm using gist ((animal ->> 'cow'));

I want to run queries like:

SELECT * FROM farm WHERE (animal ->> 'cow') > 3; 

and have that query use the index.

When I run this query:

SELECT * FROM farm WHERE (animal ->> 'cow') is null; 

then the (1) index works, but I can't get any of the indexes to work for the inequality.

Is such an index possible?

The farm table contains only ~5000 farms, but some of them contain 100s of animals and the queries simply take too long for my use case. An index like this is the only method I can think of for speeding this query up, but perhaps there is another option.

like image 793
lnhubbell Avatar asked Mar 18 '16 04:03

lnhubbell


People also ask

How does Postgres index Jsonb?

When we use ->> operator of JSONB, PostgreSQL can use B-tree or Hash index for processing the operations. ->> operator returns the value of the specified attribute in text format. PostgreSQL can use indexes for the text results as compare operands.

Can Postgres handle JSON?

PostgreSQL is a Relational Database Management System (RDBMS) developed by the PostgreSQL Global Development Group. It has been in use for over 20 years and supports both SQL and JSON for relational and non-relational queries in order to provide flexibility and SQL compliance.

How do I query Jsonb data in PostgreSQL?

Querying the JSON documentPostgreSQL has two native operators -> and ->> to query JSON documents. The first operator -> returns a JSON object, while the operator ->> returns text. These operators work on both JSON as well as JSONB columns. There are additional operators available for JSONB columns.

Is Postgres Jsonb fast?

Significantly faster to process. Supports indexing (which can be a significant advantage, as we'll see later). simpler schema designs (replacing entity-attribute-value (EAV) tables with JSONB columns, which can be queried, indexed, and joined, allowing for performance improvements up until 1000X!)


Video Answer


1 Answers

Your other two indexes won't work simply because the ->> operator returns text, while you obviously have the jsonb gin operator classes in mind. Note that you only mention json, but you actually need jsonb for advanced indexing capabilities.

To work out the best indexing strategy, you'd have to define more closely which queries to cover. Are you only interested in cows? Or all animals / all tags? Which operators are possible? Does your JSON document also include non-animal keys? What to do with those? Do you want to include rows in the index where cows (or whatever) don't show up in the JSON document at all?

Assuming:

  • We are only interested in cows at the first level of nesting.
  • The value is always a valid integer.
  • We are not interested in rows without cows.

I suggest a functional btree index, much like you already have, but cast the value to integer. I don't suppose you'd want the comparison evaluated as text (where '2' is greater than '1111').

CREATE INDEX animal_index ON farm (((animal ->> 'cow')::int));  -- ! 

The extra set of parentheses is required for the cast shorthand to make the syntax for the index expression unambiguous.

Use the same expression in your queries to make Postgres realize the index is applicable:

SELECT * FROM farm WHERE (animal ->> 'cow')::int > 3; 

If you need a more generic jsonb index, consider:

  • What's the proper index for querying structures in arrays in Postgres jsonb?

For a known, static, trivial number of animals (like you commented), I suggest partial indexes like:

CREATE INDEX animal_index ON farm (((animal ->> 'cow')::int)) WHERE (animal ->> 'cow') IS NOT NULL;  CREATE INDEX animal_index ON farm (((animal ->> 'chicken')::int)) WHERE (animal ->> 'chicken') IS NOT NULL; 

Etc.

You may have to add the index condition to the query:

SELECT * FROM farm WHERE (animal ->> 'cow')::int > 3 AND   (animal ->> 'cow') IS NOT NULL;  

May seem redundant, but may be necessary. Test with ANALYZE!

like image 129
Erwin Brandstetter Avatar answered Sep 22 '22 02:09

Erwin Brandstetter