I have JSON documents stored in Postgres under the JSON data type (Postgres 9.3) and I need to recursively collect the key names down the tree.
For example, given this JSON tree
{
"files": {
"folder": {
"file1": {
"property": "blah"
},
"file2": {
"property": "blah"
},
"file3": {
"property": "blah"
},
"file4": {
"property": "blah"
}
}
},
"software": {
"apt": {
"package1": {
"version": 1.2
},
"package2": {
"version": 1.2
},
"package3": {
"version": 1.2
},
"package4": {
"version": 1.2
}
}
}
}
I would like to extract something like [file1,file2,file3,file3,package1,package2,package3,package4]
Basically just a listing of keys that I can use for a text search index.
I know I can get a listing of keys on the outer most objects using something like
SELECT DISTINCT(json_object_keys(data))
And I know it's possible to to recursively climb through the tree using something like
WITH RECURSIVE data()
but i'm having trouble putting the two together.
Can anyone help?
@Simon's answer above is great, but for my similar case building JSON objects diff, I want to have keys path like in JSONpath form, and not only last name, including array indexes and also values.
So, on example {"A":[[[{"C":"B"}, {"D":"E"}]]],"X":"Y", "F": {"G": "H"}}
I need not only keys X
, D
, G
, C
, F
, A
, but values on each path like .A[0][0][0].C
= 'B'.
There are also some minor enhancements like:
I hope it will be helpful for someone also:
WITH RECURSIVE _tree (key, value, type) AS (
SELECT
NULL as key
,'{"A":[[[{"C":"B"}, {"D":"E"}]]],"X":"Y", "F": {"G": "H"}}'::jsonb as value
,'object'
UNION ALL
(
WITH typed_values AS (
SELECT key, jsonb_typeof(value) as typeof, value FROM _tree
)
SELECT CONCAT(tv.key, '.', v.key), v.value, jsonb_typeof(v.value)
FROM typed_values as tv, LATERAL jsonb_each(value) v
WHERE typeof = 'object'
UNION ALL
SELECT CONCAT(tv.key, '[', n-1, ']'), element.val, jsonb_typeof(element.val)
FROM typed_values as tv, LATERAL jsonb_array_elements(value) WITH ORDINALITY as element (val, n)
WHERE typeof = 'array'
)
)
SELECT DISTINCT key, value #>> '{}' as value, type
FROM _tree
WHERE key IS NOT NULL
ORDER BY key
Dbfiddle to run.
The trick is to add some final condition testing using json_typeof
at the right place.
You should also be using jsonb
if you don't care about object key order.
Here is my working environment:
CREATE TABLE test (
id SERIAL PRIMARY KEY,
doc JSON
);
INSERT INTO test (doc) VALUES ('{
"files": {
"folder": {
"file1": {
"property": "blah"
},
"file2": {
"property": "blah"
},
"file3": {
"property": "blah"
},
"file4": {
"property": "blah",
"prop" : {
"clap": "clap"
}
}
}
},
"software": {
"apt": {
"package1": {
"version": 1.2
},
"package2": {
"version": 1.2
},
"package3": {
"version": 1.2
},
"package4": {
"version": 1.2
}
}
}
}');
The recursion is stopped when the second query does not return any rows. This is done by passing an empty object to json_each
.
WITH RECURSIVE doc_key_and_value_recursive(key, value) AS (
SELECT
t.key,
t.value
FROM test, json_each(test.doc) AS t
UNION ALL
SELECT
t.key,
t.value
FROM doc_key_and_value_recursive,
json_each(CASE
WHEN json_typeof(doc_key_and_value_recursive.value) <> 'object' THEN '{}' :: JSON
ELSE doc_key_and_value_recursive.value
END) AS t
)
SELECT *
FROM doc_key_and_value_recursive
WHERE json_typeof(doc_key_and_value_recursive.value) <> 'object';
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With