
Split a JSON file into separate files

Tags: json, file, split, jq

I have a large JSON file that is an object of objects, which I would like to split into separate files named after the object keys. Is it possible to achieve this using jq or any other off-the-shelf tools?

The original JSON is in the following format:

{ "item1": {...}, "item2": {...}, ...}

Given this input, I would like to produce the files item1.json, item2.json, etc.

kissaprofeetta asked Feb 26 '15 13:02


3 Answers

This should give you a start:

for f in `cat input.json | jq -r 'keys[]'` ; do
  # Index with .["key"] so keys that are not plain identifiers (e.g. "my-item") also work
  cat input.json | jq ".[\"$f\"]" > "$f.json"
done

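Or, if you prefer more idiomatic Bash syntax without the extra cat:

for f in $(jq -r 'keys[]' input.json) ; do
  # Pass the key as a jq variable instead of splicing it into the filter
  jq --arg k "$f" '.[$k]' input.json > "$f.json"
done

Both loops rely on word splitting of the key list, so they assume the keys contain no whitespace or glob characters.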

Hans Z. answered Nov 16 '22 00:11


Here's a solution that requires only one call to jq:

jq -cr 'keys[] as $k | "\($k)\n\(.[$k])"' input.json |
  while read -r key ; do
    read -r item
    printf "%s\n" "$item" > "/tmp/$key.json"
  done

It might be faster to pipe the output of the jq command to awk, e.g.:

jq -cr 'keys[] as $k | "\($k)\t\(.[$k])"' input.json |
  awk -F\\t '{ print $2 > "/tmp/" $1 ".json" }'
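
To see what the awk stage does on its own, here is a minimal sketch that feeds it two hand-written "key<TAB>JSON" lines in place of the jq output. The scratch directory `outdir` and the `dir` awk variable are assumptions made for the illustration, and the `close()` call is an addition: some awk implementations limit how many output files can be open at once, which could matter when the input has many keys.

```shell
# Stand-in for the jq stage: two "key<TAB>compact JSON" lines.
# outdir is a hypothetical scratch directory, used here instead of /tmp.
outdir=$(mktemp -d)
printf 'item1\t{"a":1}\nitem2\t{"b":2}\n' |
  awk -F'\t' -v dir="$outdir" '{
    file = dir "/" $1 ".json"   # field 1 is the key
    print $2 > file             # field 2 is the JSON text
    close(file)                 # release the handle before the next key
  }'
cat "$outdir/item1.json"
```

Closing each file costs a little time per key, so whether this variant is still faster than the while-read loop would need to be measured on real data.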

Of course, these approaches will need to be modified if the key names contain characters that cannot be used in filenames.
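
One possible modification is to map each key to a conservative file name before writing. The sketch below is only an illustration: `safe_name` is a hypothetical helper, and the choice to keep letters, digits, dot, underscore and hyphen while replacing everything else with an underscore is an assumption, not something the approaches above require.

```shell
# Hypothetical helper: turn an arbitrary JSON key into a conservative file name.
safe_name() {
  # Replace every byte outside [A-Za-z0-9._-] with an underscore.
  name=$(printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_')
  # Avoid an empty name and the special directory entries "." and "..".
  case $name in
    ''|.|..) name="_$name" ;;
  esac
  printf '%s\n' "$name"
}

safe_name 'a/b c'   # prints a_b_c
```

In the while-read loop, the output path could then be written as "/tmp/$(safe_name "$key").json". Note that distinct keys can collapse to the same name (for example "a/b" and "a_b"), so a real script would need to detect such collisions.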

peak answered Nov 15 '22 23:11


Is it possible to achieve this using jq or any other off-the-shelf tools?

It is. xidel can also do this very efficiently.

Let's assume 'input.json' contains:

{
  "item1": {
    "a": 1
  },
  "item2": {
    "b": 2
  },
  "item3": {
    "c": 3
  }
}

Inefficient Bash method:

for f in $(xidel -s input.json -e '$json()'); do
  # Quote "$f" so the key is not word-split or glob-expanded by the shell
  xidel -s input.json -e '$json("'"$f"'")' > "$f.json"
done

A separate xidel instance is started for every object key, and each one parses the whole file again. With a very large JSON file this is pretty slow.

Efficient file:write() method:

xidel -s input.json -e '
  $json() ! file:write(
    .||".json",
    $json(.),
    {"method":"json"}
  )
'

One xidel call creates 'item{1,2,3}.json'. Their content is a compact/minified object, like {"a": 1} for 'item1.json'.

xidel -s input.json -e '
  for $x in $json() return
  file:write(
    concat($x,".json"),
    $json($x),
    {
      "method":"json",
      "indent":true()
    }
  )
'

One xidel call creates 'item{1,2,3}.json'. Their content is a prettified object (because of {"indent":true()}), like...

{
  "a": 1
}

...for 'item1.json'. Different query (for-loop), same result.

This method is many times faster than the Bash loop!

Reino answered Nov 15 '22 22:11