Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is a simple way on postgres to create deeply nested JSON structures without having to write very complex indented queries?

I'm joining several tables together in a postgres database, and returning the values in the right joined table as an aggregated JSON structure in the left joined table. However I find that that query becomes more complicated the more tables that are joined. For example:

select row_to_json(output)
from (
    select image_type.name,
    (
        select json_agg(instances)
        from (
            select image_instance.name, (
                select json_agg(versions)
                from (
                    select image_version.name
                    from image_version
                    where image_version.image_instance_id = image_version.image_instance_id
                ) versions
            ) AS versions
            from image_instance
            where image_instance.image_type_id = image_type.image_type_id
        ) instances
    ) AS images
    from image_type
) output;

I've joined three tables here, however I'd like to add several more tables to this, but the code will quickly become unwieldy and hard to maintain. Is there a simple way to generate these kinds of aggregated joins?

like image 516
Michael Barton Avatar asked Oct 19 '22 00:10

Michael Barton


1 Answers

First of all, JSON is no different than regular fields when combining data from multiple tables: things can get complex quite quickly. There are, however, a few techniques to keep things manageable:

1. Daisy chain functions

There is no need to treat the output from each function independently, you can feed the output from one function as input to the next in a single statement. In your example this means that you lose a level of sub-select for each level of aggregation and you can forget about the aliases. Your example becomes:

select row_to_json(row(image_type.name, (
       select json_agg(image_instance.name, (
              select json_agg(image_version.name)
              from image_version
              where image_version.image_instance_id = image_instance.id) -- join edited
       from image_instance
       where image_instance.image_type_id = image_type.image_type_id))))
from image_type;

2. Don't use scalar sub-queries

This may be a matter of personal taste, but scalar sub-queries tend to be difficult to read (and write: you had an obvious error in the join condition of your innermost scalar sub-query, just to illustrate my point). Use regular sub-queries with explicit joins and aggregations instead:

select row_to_json(row(it.name, iiv.name))
from image_type it
join (select image_type_id, json_agg(name, iv_name) as name
      from image_instance ii
      join (select image_instance_id, json_agg(name) as iv_name
            from image_version group by 1) iv on iv.image_instance_id = ii.id
      group by 1) iiv using (image_type_id);

3. Modularize

Right there at the beginning of the documentation, in the Tutorial section (highly recommended reading, however proficient you think you are):

Making liberal use of views is a key aspect of good SQL database design.

create view iv_json as
    select image_instance_id, json_agg(name) as iv_name
    from image_version
    group by 1;

create view ii_json as
    select image_type_id, json_agg(name, iv_name) as name
    from image_instance
    join iv_json on image_instance_id = image_instance.id
    group by 1;

Your main query now becomes:

select row_to_json(row(it.name, ii.name))
from image_type it
join ii_json ii using (image_type_id);

And so on...

This is obviously by far the easiest to code, test and maintain. Performance is a non-issue here: the query optimizer will flatten all the linked views into a single execution plan.

Final note: If you are using PG9.4+, you can use json_build_object() instead of row_to_json() for more intelligible output.

like image 117
Patrick Avatar answered Oct 21 '22 11:10

Patrick