Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unnesting a list of JSON objects in PostgreSQL

I have a TEXT column in my PostgreSQL (9.6) database containing a list of one or more dictionnaries, like those ones.

[{"line_total_excl_vat": "583.3300", "account": "", "subtitle": "", "product_id": 5532548, "price_per_unit": "583.3333", "line_total_incl_vat": "700.0000", "text": "PROD0008", "amount": "1.0000", "vat_rate": "20"}]

or

[{"line_total_excl_vat": "500.0000", "account": "", "subtitle": "", "product_id": "", "price_per_unit": "250.0000", "line_total_incl_vat": "600.0000", "text": "PROD003", "amount": "2.0000", "vat_rate": "20"}, {"line_total_excl_vat": "250.0000", "account": "", "subtitle": "", "product_id": 5532632, "price_per_unit": "250.0000", "line_total_incl_vat": "300.0000", "text": "PROD005", "amount": "1.0000", "vat_rate": "20"}]

I would like to retrieve each dictionnary from the column and parse them in different columns.

For this example:

id | customer | blurb
---+----------+------
 1 | Joe      | [{"line_total_excl_vat": "583.3300", "account": "", "subtitle": "", "product_id": 5532548, "price_per_unit": "583.3333", "line_total_incl_vat": "700.0000", "text": "PROD0008", "amount": "1.0000", "vat_rate": "20"}]
 2 | Sally    | [{"line_total_excl_vat": "500.0000", "account": "", "subtitle": "", "product_id": "", "price_per_unit": "250.0000", "line_total_incl_vat": "600.0000", "text": "PROD003", "amount": "2.0000", "vat_rate": "20"}, {"line_total_excl_vat": "250.0000", "account": "", "subtitle": "", "product_id": 5532632, "price_per_unit": "250.0000", "line_total_incl_vat": "300.0000", "text": "PROD005", "amount": "1.0000", "vat_rate": "20"}]

would become:

id | customer | line_total_excl_vat  | account |  product_id | ...
---+----------+----------------------+---------+------------
 1 | Joe      | 583.3300             |     null|      5532548  
 2 | Sally    | 500.0000             |     null|         null
 3 | Sally    | 250.0000             |     null|      5532632
like image 812
user2475110 Avatar asked Dec 23 '22 06:12

user2475110


1 Answers

if you know beforehand what fields you want to extract, cast the text to json / jsonb & use json_to_recordset / jsonb_to_recordset. Note that this method requires the fields names / types to be explicitly be specified. Unspecified fields that are in the json dictionaries will not be extracted.

See official postgesql documentation on json-functions

self contained example:

WITH tbl (id, customer, dat) as ( values
     (1, 'Joe',
      '[{ "line_total_excl_vat": "583.3300"
        , "account": ""
        , "subtitle": ""
        , "product_id": 5532548
        , "price_per_unit": "583.3333"
        , "line_total_incl_vat": "700.0000"
        , "text": "PROD0008"
        , "amount": "1.0000"
        , "vat_rate": "20"}]')
    ,(2, 'Sally', 
      '[{ "line_total_excl_vat": "500.0000"
        , "account": ""
        , "subtitle": ""
        , "product_id": ""
        , "price_per_unit": "250.0000"
        , "line_total_incl_vat": "600.0000"
        , "text": "PROD003"
        , "amount": "2.0000"
        , "vat_rate": "20"}
      , { "line_total_excl_vat": "250.0000"
        , "account": ""
        , "subtitle": ""
        , "product_id": 5532632
        , "price_per_unit": "250.0000"
        , "line_total_incl_vat": "300.0000"
        , "text": "PROD005"
        , "amount": "1.0000"
        , "vat_rate": "20"}]')
)
SELECT id, customer, x.*
FROM tbl
   , json_to_recordset(dat::json) x 
        (  line_total_excl_vat numeric
         , acount text
         , subtitle text
         , product_id text
         , price_per_unit numeric
         , line_total_incl_vat numeric
         , "text" text
         , amount numeric
         , vat_rate numeric
        )

produces the following output:

id    customer    line_total_excl_vat    acount    subtitle    product_id    price_per_unit    line_total_incl_vat    text      amount    vat_rate
 1    Joe                      583.33                             5532548    583.3333                          700    PROD0008       1          20
 2    Sally                       500                                             250                          600    PROD003        2          20
 2    Sally                       250                             5532632         250                          300    PROD005        1          20

This format is often referred to as the wide format.

It is also possible to extract the data in a long format, which has the additional benefit that it keeps all the data without explicitly mentioning the field names. In this case, the query may be written as (the test data is elided for brevity)

SELECT id, customer, y.key, y.value, x.record_number
FROM tbl
   , lateral json_array_elements(dat::json) WITH ORDINALITY AS x (val, record_number)
   , lateral json_each_text(x.val) y

The with ordinality in the above statement adds a sequence number for each element in the unnested array, and is be used to disambiguate fields from different arrays for each customer.

This produced the output:

id  customer key                    value     record_number
1   Joe      line_total_excl_vat    583.3300  1
1   Joe      account                          1
1   Joe      subtitle                         1
1   Joe      product_id             5532548   1
1   Joe      price_per_unit         583.3333  1
1   Joe      line_total_incl_vat    700.0000  1
1   Joe      text                   PROD0008  1
1   Joe      amount                 1.0000    1
1   Joe      vat_rate               20        1
2   Sally    line_total_excl_vat    500.0000  1
2   Sally    account                          1
2   Sally    subtitle                         1
2   Sally    product_id                       1
2   Sally    price_per_unit         250.0000  1
2   Sally    line_total_incl_vat    600.0000  1
2   Sally    text                   PROD003   1
2   Sally    amount                 2.0000    1
2   Sally    vat_rate               20        1
2   Sally    line_total_excl_vat    250.0000  2
2   Sally    account                          2
2   Sally    subtitle                         2
2   Sally    product_id             5532632   2
2   Sally    price_per_unit         250.0000  2
2   Sally    line_total_incl_vat    300.0000  2
2   Sally    text                   PROD005   2
2   Sally    amount                 1.0000    2
2   Sally    vat_rate               20        2
like image 144
Haleemur Ali Avatar answered Jan 04 '23 18:01

Haleemur Ali