Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

snowflake json lateral subquery

I have the following in snowflake:

create or replace table json_tmp as select column1 as id, parse_json(column2) as c
    from VALUES (1,
                 '{"id": "0x1",
                   "custom_vars": [
                      { "key": "a", "value": "foo" },
                      { "key": "b", "value": "bar" }
                   ] }') v;

Based on the FLATTEN docs, I hoped to turn these into a table looking like this:

+-------+---------+-----+-----+
| db_id | json_id |  a  |  b  |
+-------+---------+-----+-----+
+-------+---------+-----+-----+
|   1   |   0x1   | foo | bar |
+-------+---------+-----+-----+

Here is the query I tried; it resulted in a SQL compilation error: "Object 'CUSTOM_VARS' does not exist."

select json_tmp.id as dbid,
    f.value:id as json_id,
    a.v,
    b.v
from json_tmp,
    lateral flatten(input => json_tmp.c) as f,
    lateral flatten(input => f.value:custom_vars) as custom_vars,
    lateral (select value:value as v from custom_vars where value:key = 'a') as a,
    lateral (select value:value as v from custom_vars where value:key = 'b') as b;

What exactly is the error here? Is there a better way to do this transformation?

like image 342
jsharp Avatar asked Jun 30 '26 00:06

jsharp


1 Answers

Note - your solution doesn't actually perform any joins - flatten is a "streaming" operation, it "explodes" the input, and then selects the rows it wants. If you only have 2 attributes in the data, it should be reasonably fast. However, if not, it can lead to an unnecessary data explosion (e.g. if you have 1000s of attributes).

The fastest solution depends on how your data is structured exactly, and what you can assume about the input. For example, if you know that 'a' and 'b' are always in that order, you can obviously use

select 
    id as db_id, 
    c:id, 
    c:custom_vars[0].value, 
    c:custom_vars[1].value 
from json_tmp;

If you know that custom_vars is always 2 elements, but the order is not known, you could do e.g.

select 
    id as db_id, 
    c:id, 
    iff(c:custom_vars[0].key = 'a', c:custom_vars[0].value, c:custom_vars[1].value), 
    iff(c:custom_vars[0].key = 'b', c:custom_vars[0].value, c:custom_vars[1].value) 
from json_tmp;

If the size of custom_vars is unknown, you could create a JavaScript function like extract_key(custom_vars, key) that would iterate over custom_vars and return value for the found key (or e.g. null or <empty_string> if not found).

Hope this helps. If not, please provide more details about your problem (data, etc).

like image 138
Marcin Zukowski Avatar answered Jul 02 '26 14:07

Marcin Zukowski



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!