I have a large JSON file that is an object of objects, which I would like to split into separate files named after the object keys. Is it possible to achieve this using jq or any other off-the-shelf tools?
The original JSON is in the following format:
{ "item1": {...}, "item2": {...}, ...}
Given this input I would like to produce files item1.json, item2.json etc.
This should give you a start:
for f in `cat input.json | jq -r 'keys[]'` ; do
cat input.json | jq --arg k "$f" '.[$k]' > "$f.json"
done
Or, if you prefer the more bash-like syntax that some favor:
for f in $(jq -r 'keys[]') ; do
jq ".[\"$f\"]" < input.json > "$f.json"
done < input.json
Here's a solution that requires only one call to jq:
jq -cr 'keys[] as $k | "\($k)\n\(.[$k])"' input.json |
while read -r key ; do
read -r item
printf "%s\n" "$item" > "/tmp/$key.json"
done
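To make the line pairing explicit, here is a minimal sketch that wraps the same loop in a function so it can be fed any stream of alternating key and value lines. The name write_pairs and its output-directory argument are my own illustration, not part of jq, and the sketch assumes that keys contain no newlines and are safe to use as file names.

```shell
# write_pairs DIR (hypothetical helper): read alternating key/value lines
# from stdin and write each value to DIR/<key>.json.
write_pairs() {
  outdir=$1
  # Each iteration consumes two lines: first the key, then its compact JSON value.
  while IFS= read -r key && IFS= read -r item; do
    printf '%s\n' "$item" > "$outdir/$key.json"
  done
}

# Intended use with the jq filter shown above:
#   jq -cr 'keys[] as $k | "\($k)\n\(.[$k])"' input.json | write_pairs /tmp
```

Because jq -c prints every value on a single line, each key line is always followed by exactly one value line, which is what lets the loop read them in pairs.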
It might be faster to pipe the output of the jq command to awk, e.g.:
jq -cr 'keys[] as $k | "\($k)\t\(.[$k])"' input.json |
awk -F'\t' '{ f = "/tmp/" $1 ".json"; print $2 > f; close(f) }'
Of course, these approaches will need to be modified if the key names contain characters that cannot be used in filenames.
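One possible modification, as a rough sketch: map each key to a safe file name before writing. The function name sanitize_key and the choice of allowed characters are assumptions made for illustration. Note that two different keys can map to the same name, so this is only suitable when that cannot happen in your data.

```shell
# sanitize_key KEY (hypothetical helper): print KEY with every character
# outside [A-Za-z0-9._-] replaced by "_".
sanitize_key() {
  safe=$(printf '%s\n' "$1" | sed 's/[^A-Za-z0-9._-]/_/g')
  # Avoid an empty name and the special directory names "." and "..".
  case $safe in
    ''|.|..) safe=_ ;;
  esac
  printf '%s\n' "$safe"
}

# Intended use inside the while loop shown earlier:
#   printf '%s\n' "$item" > "/tmp/$(sanitize_key "$key").json"
```

For example, a key such as "a/b c" would be written to a_b_c.json instead of failing on the slash.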
Is it possible to achieve this using jq or any other off-the-shelf tools?
It is. xidel can also do this very efficiently.
Let's assume 'input.json' contains:
{
"item1": {
"a": 1
},
"item2": {
"b": 2
},
"item3": {
"c": 3
}
}
Inefficient Bash method:
for f in $(xidel -s input.json -e '$json()'); do
xidel -s input.json -e '$json("'"$f"'")' > "$f.json"
done
For every object key, another instance of xidel is called to parse the object. Especially when you have a very large JSON file, this is pretty slow.
Efficient file:write() method:
xidel -s input.json -e '
$json() ! file:write(
.||".json",
$json(.),
{"method":"json"}
)
'
One xidel call creates 'item{1,2,3}.json'. Their content is a compact/minified object, like {"a": 1} for 'item1.json'.
xidel -s input.json -e '
for $x in $json() return
file:write(
concat($x,".json"),
$json($x),
{
"method":"json",
"indent":true()
}
)
'
One xidel call creates 'item{1,2,3}.json'. Their content is a prettified object (because of {"indent":true()}), like...
{
"a": 1
}
...for 'item1.json'. Different query (for-loop), same result.
This method is many times faster than the Bash loop, because the input is parsed only once.