
Join tables using a value inside a JSONB column

There are two tables:

Authorized Contacts (auth_contacts):

(
userid varchar
contacts jsonb
)

contacts contains an array of contacts with attributes {contact_id, type}

discussion:

(
contact_id varchar
discussion_id varchar
discussion_details jsonb
)

The table auth_contacts has at least 100k records, so making it a non-JSONB type does not seem appropriate, as it would double or triple the number of records.

Sample data for auth_contacts:

userid  | contacts
'11111' | '{"contact": [{"type": "type_a", "contact_id": "1-A-12"}
                      , {"type": "type_b", "contact_id": "1-A-13"}]}'

The discussion table has around 5 million records.

I want to join discussion.contact_id (a relational column) with the contact_id values that sit in JSON objects inside an array of JSON objects in auth_contacts.contacts.

One very crude way is:

SELECT *
FROM discussion d 
JOIN (SELECT userid, JSONB_OBJECT_KEYS(a.contacts) AS auth_contact
      FROM auth_contacts a) AS contacts
      ON (d.contact_id = contacts.auth_contact::text)

What this does is create, at runtime (in the inner SQL), a userid vs. contact id table, which is what I was trying to avoid and why I went for the JSONB data type. For a user with many records this query takes 26+ seconds, which is not good at all. I tried a few other ways: PostgreSQL 9.4: Aggregate / Join table on JSON field id inside array

But there should be a cleaner and better way, something as simple as JOIN d.contact_id = contacts -> contact -> contact_id? When I try this, it doesn't yield any results.

From searching the net, this seems to be a pretty cumbersome task?

Prachi Tripathi asked Jul 09 '15 06:07




1 Answer

Proof of concept

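Your "crude way" doesn't actually work: jsonb_object_keys() returns the top-level keys of the document (here just 'contact'), not the contact_id values. Here is another crude way that does: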

SELECT *
FROM  auth_contacts a
    , jsonb_to_recordset(a.contacts->'contact') AS c(contact_id text)
JOIN  discussion d USING (contact_id);
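
To spell out the implicit steps, the same join can be sketched with jsonb_array_elements() and an explicit LATERAL join. This is only an illustrative variant of the query above, assuming the layout from the sample data (a top-level "contact" key holding an array of objects). The aliases c and elem are arbitrary names, not from the question:

```sql
-- Sketch only: unnest the JSON array into one row per contact object,
-- then join on the extracted text value.
SELECT a.userid, d.*
FROM   auth_contacts a
CROSS  JOIN LATERAL jsonb_array_elements(a.contacts->'contact') AS c(elem)
JOIN   discussion d ON d.contact_id = c.elem->>'contact_id';
```

This also hints at why a condition like contacts -> contact -> contact_id finds nothing: contacts->'contact' is an array, and applying ->'contact_id' to an array returns NULL, so the array has to be expanded into rows (or tested with a containment operator) first.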

As has been commented, you can also formulate a join condition with the contains operator @>:

SELECT *
FROM   auth_contacts a
JOIN   discussion d ON a.contacts->'contact'
                    @> json_build_array(json_build_object('contact_id', d.contact_id))::jsonb;

Build the right-hand operand with JSON creation functions rather than string concatenation. This looks cumbersome, but will actually be very fast if supported by a functional jsonb_path_ops GIN index:

CREATE INDEX auth_contacts_contacts_gin_idx ON auth_contacts
USING  gin ((contacts->'contact') jsonb_path_ops);

Details:

  • Index for finding an element in a JSON array
  • Postgres 9.4 jsonb array as table
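
As a rough illustration of the kind of lookup such an index is meant to support, here is a sketch that finds the users authorized for one given contact. It assumes the GIN index on (contacts->'contact') shown above exists; the literal '1-A-12' is taken from the sample data in the question:

```sql
-- Sketch only: containment test on the indexed expression.
-- The expression on the left must match the indexed expression
-- for the planner to be able to consider the index.
SELECT a.userid
FROM   auth_contacts a
WHERE  a.contacts->'contact' @> '[{"contact_id": "1-A-12"}]';
```

Running the statement under EXPLAIN is a simple way to check whether the index is actually chosen for your data.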

Proper solution

This is all fascinating to play with, but the problem here is the relational model. Your claim:

hence making it non JSONB type is not appropriate according as it would double or triple the amount of records.

is the opposite of what's right. It's nonsense to wrap IDs you need for joining tables into a JSON document type. Normalize your table with a many-to-many relationship and implement all IDs you are working with inside the DB as separate columns with appropriate data type. Basics:

  • How to perform update operations on columns of type JSONB in Postgres 9.4
  • How to implement a many-to-many relationship in PostgreSQL?
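
A minimal sketch of what such a normalized design might look like, assuming each (userid, contact_id) pair occurs only once. The table name auth_contact and the index name are hypothetical, chosen here for illustration only, and the column types simply mirror those in the question:

```sql
-- Hypothetical link table: one row per authorized (user, contact) pair.
CREATE TABLE auth_contact (
   userid     varchar NOT NULL
 , contact_id varchar NOT NULL
 , "type"     text
 , PRIMARY KEY (userid, contact_id)
);

-- One possible way to populate it from the existing jsonb column.
INSERT INTO auth_contact (userid, contact_id, "type")
SELECT a.userid, c.contact_id, c."type"
FROM   auth_contacts a
     , jsonb_to_recordset(a.contacts->'contact') AS c(contact_id text, "type" text);

-- Plain btree index on the join column of the big table.
CREATE INDEX discussion_contact_id_idx ON discussion (contact_id);

-- The join then becomes an ordinary relational join.
SELECT ac.userid, d.*
FROM   auth_contact ac
JOIN   discussion d USING (contact_id)
WHERE  ac.userid = '11111';
```

The link table has more rows than auth_contacts, but each row is small and both sides of the join can use plain btree indexes.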
Erwin Brandstetter answered Sep 21 '22 00:09