Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres JSONB - join between two JSON fields

Using Postgres 9.5, how can I query a table with a JSONB column, joining two JSON fields, crucially utilising a GIN index to maximise performance?

I'm prototyping a system which can accommodate different schemas/data models for different customers on a single platform. Entity-Attribute-Value (EAV) is a common pattern for this and I'm keen to benchmark how well Postgres can handle queries on entities stored using its JSONB data type.

I have a single table which stores entities of different type e.g. customer, sales transaction.

CREATE TABLE entity_jsonb (
    id uuid PRIMARY KEY,
    type text,
    created_at timestamp without time zone,
    properties jsonb,
    org_id integer
);
CREATE INDEX entity_jsonb_created_at_idx ON entity_jsonb USING btree (created_at);
CREATE INDEX entity_jsonb_org_id_idx ON entity_jsonb USING btree (org_id);
CREATE INDEX entity_jsonb_type_idx ON entity_jsonb USING btree (type);
CREATE INDEX entity_jsonb_properties_idx ON entity_jsonb USING gin (properties);

I have a customer record in this table with type 'customer' and properties:

{"uuid": "8f8896c7-f41c-49f7-ad6e-4613f7b51a23", "email": "[email protected]", "lastname": "McCarthy", "createdAt": "May 27, 2015 12:06:45 PM", "firstname": "Nathan"}

I have a sales transaction record in this table with type 'sales_transaction' and properties:

{"uuid": "54243d48-e69f-4bb6-ab33-6defb8a0f626", "amount": 99817, "status": 0, "paymentType": 1, "currencyCode": "USD", "customerUuid": "8f8896c7-f41c-49f7-ad6e-4613f7b51a23", "transactionId": "471170"}

How can I take advantage of the GIN index (operators: @>, ?, ?&, ?|) to query transactions efficiently, but joining with customers to return data from both e.g. populating a list of transactions from the last 7 days, showing customer name as well as transaction amount?

I've tried this query to find a subset of transaction objects (for amount = 99817) and related customer details:

SELECT t.properties AS transaction, c.properties AS customer 
FROM entity_jsonb t 
JOIN entity_jsonb c ON (c.properties->>'uuid' = t.properties->>'customerUuid' AND c.type = 'customer') 
WHERE t.type = 'sales_transaction' AND t.properties @> '{"amount" : 99817}';

It works but it's not using the GIN index i.e. not as fast as I'd like it to be. I tried this approach (using the GIN @> containment operator) but it doesn't include the customer details. What am I doing wrong?

SELECT t.properties AS transaction, c.properties AS customer FROM entity_jsonb t LEFT JOIN entity_jsonb c ON (c.properties @> json_build_array(json_build_object('uuid', t.properties->'customerUuid'))::jsonb AND c.type = 'customer') WHERE t.type = 'sales_transaction' AND t.properties @> '{"amount" : 99817}';

I understand that this is not an optimum solution compared with a traditional relational design but I'm interested to see how efficient the query can be purely storing the entity data as JSON.

like image 763
siwatson Avatar asked Apr 21 '26 05:04

siwatson


1 Answers

My query wasn't far off. I didn't need json_build_array. The query now runs 10x faster than the version which does't use the GIN index, for a dataset of 100k customers:

SELECT t.properties AS transaction, c.properties AS customer 
FROM entity_jsonb t 
JOIN entity_jsonb c ON (c.properties @> json_build_object('uuid', t.properties->'customerUuid')::jsonb AND c.type = 'customer') 
WHERE t.type = 'sales_transaction' AND t.properties @> '{"amount" : 99817}';
like image 50
siwatson Avatar answered Apr 24 '26 21:04

siwatson



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!