Index on JSON field with dynamic keys

I'm on PG 9.5 and I have a table Visitors(id, data::json)

Example:

Visitor(id: 1, data: {name: 'Jack', age: 33, is_user: true })

I'd like to perform queries like

  • Give me all visitors named Jack and age > 25
  • Give me all visitors who are users, but where name is unspecified (key not in json)

The keys inside the data column are user-specified and as such are dynamic.

Which index makes the most sense in this situation?

Tarlen asked Oct 20 '16 15:10



2 Answers

You can use a GIN index on a jsonb column, which gives you generalized, dynamic indexing of the keys and values inside a JSON value.

CREATE TABLE visitors (
  id integer,
  data jsonb
);

CREATE INDEX idx_visitors_data ON visitors USING GIN (data);

SELECT * FROM visitors
WHERE data @> '{"is_user": true}' AND NOT data ? 'name';
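
The question's table stores data as json rather than jsonb, and the default GIN operator classes are defined for jsonb only. A minimal sketch of converting the existing column, assuming the table is named visitors as in the example above (the statement rewrites the table and locks it while it runs):

```sql
-- Assumption: the existing column is visitors.data of type json.
-- The USING clause converts each stored value with the built-in json -> jsonb cast.
ALTER TABLE visitors
  ALTER COLUMN data TYPE jsonb
  USING data::jsonb;
```

Note that jsonb does not preserve key order, whitespace, or duplicate keys, so this only fits if nothing depends on the original text of the stored JSON.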

Unfortunately, GIN indexes don't support numeric range comparisons. So while you could still issue a query for visitors named Jack aged over 25:

SELECT * FROM visitors
WHERE data @> '{"name": "Jack"}' AND ((data ->> 'age')::integer) > 25;

This will only use the index to find rows with the name "Jack"; the actual test that the age is over 25 is applied as a filter over those matching rows.
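
One way to check which part of the predicate the index actually serves is to inspect the query plan. A sketch, assuming the visitors table and the GIN index on data from above; on a small table the planner may well prefer a sequential scan regardless of the index:

```sql
-- Typically the containment test appears as the index condition of a bitmap
-- index scan, while the age comparison is listed as a filter on the heap scan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM visitors
WHERE data @> '{"name": "Jack"}' AND ((data ->> 'age')::integer) > 25;
```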

Note that if you really need range comparisons, you can still add non-GIN indexes on specific paths inside the JSON value, if you expect them to appear often enough to make that worthwhile. For example, you could add a B-tree expression index on the integer value of data ->> 'age' that supports range comparisons:

CREATE INDEX idx_visitors_data_age ON visitors ( ((data ->> 'age')::integer) );

(note the extra parentheses; you'll get an error without them).
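
An expression index is only considered when the query uses the same expression as the index definition. A short sketch of what that means for the age index above; it assumes every stored age that is present is a valid integer, since otherwise both the index build and the cast in the query raise an error:

```sql
-- Can use idx_visitors_data_age: the expression matches the index definition.
SELECT * FROM visitors
WHERE (data ->> 'age')::integer > 25;

-- Cannot use it: a different cast makes this a different expression.
SELECT * FROM visitors
WHERE (data ->> 'age')::numeric > 25;
```

Rows without an age key yield NULL for the expression, so they are simply excluded by the comparison.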

See this excellent blog post for further information.

Avish answered Sep 21 '22 12:09


You can also look at the JsQuery extension, a query language for the jsonb data type. It adds functionality that jsonb currently lacks in PostgreSQL, such as a simple and effective way to search in nested objects and arrays, and more comparison operators with index support. Read more here: https://github.com/postgrespro/jsquery.
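
JsQuery is not part of core PostgreSQL, so its operator classes and the @@ operator for jsquery are only available once the extension is enabled in the database. A minimal sketch, assuming the extension has already been built and installed on the server:

```sql
-- Assumption: the jsquery extension files are present on the server.
CREATE EXTENSION IF NOT EXISTS jsquery;
```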

In your case, you can create a jsonb_path_value_ops index:

CREATE INDEX idx_visitors ON visitors USING GIN (data jsonb_path_value_ops);

and use the following queries:

select * from visitors where data @@ 'name = "Jack" and age > 25';
select * from visitors where data @@ 'not name = * and is_user=true';

Leonard answered Sep 18 '22 12:09