First, your JSON has nested objects, so it normally cannot be converted directly to CSV. You need to flatten it to something like this:
{ "pk": 22, "model": "auth.permission", "codename": "add_logentry", "content_type": 8, "name": "Can add log entry" }, ......]
JSONPath distinguishes between the "root object or element" ($) and "the current object or element" (@). jq simply uses . to refer to the current JSON entity, so it is context-dependent: it can refer to items in the input stream of the jq process as a whole, or to the output of a filter.
First, obtain an array containing all the different object property names in your object array input. Those will be the columns of your CSV:
(map(keys) | add | unique) as $cols
Then, for each object in the object array input, map the column names you obtained to the corresponding properties in the object. Those will be the rows of your CSV.
map(. as $row | $cols | map($row[.])) as $rows
Finally, put the column names before the rows, as a header for the CSV, and pass the resulting row stream to the @csv filter.
$cols, $rows[] | @csv
All together now. Remember to use the -r flag to get the result as a raw string:
jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv'
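As a quick check of the command above, here is a minimal sketch run against a made-up two-object input (the sample data and the shell variable names are assumptions, not taken from the original question). It illustrates that a key missing from an object comes out as an empty field, because the lookup yields null and @csv renders null as nothing:

```shell
# Hypothetical sample input: two objects with partly different keys.
input='[{"b": 2, "a": "x"}, {"a": "y", "c": true}]'

result=$(printf '%s' "$input" | jq -r '
  (map(keys) | add | unique) as $cols
  | map(. as $row | $cols | map($row[.])) as $rows
  | $cols, $rows[]
  | @csv
')

printf '%s\n' "$result"
# "a","b","c"
# "x",2,
# "y",,true
```

The header is the sorted union of all keys, so the column order does not follow the order in which keys appear in the input.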
jq -r '(.[0] | keys_unsorted) as $keys | $keys, map([.[ $keys[] ]])[] | @csv'
or:
jq -r '(.[0] | keys_unsorted) as $keys | ([$keys] + map([.[ $keys[] ]])) [] | @csv'
Describing the details is tricky because jq is stream-oriented, meaning it operates on a sequence of JSON data rather than a single value. The input JSON stream is converted to some internal type which is passed through the filters, then encoded in an output stream at the program's end. The internal type isn't modeled by JSON and doesn't exist as a named type. It's most easily demonstrated by examining the output of a bare index (.[]) or the comma operator (examining it directly could be done with a debugger, but that would be in terms of jq's internal data types rather than the conceptual data types behind JSON).
$ jq -c '.[]' <<<'["a", "b"]'
"a"
"b"
$ jq -cn '"a", "b"'
"a"
"b"
Note that the output isn't an array (which would be ["a", "b"]). Compact output (the -c option) shows that each array element (or argument to the , filter) becomes a separate value in the output (each is on a separate line).
A stream is like a JSON-seq, but uses newlines rather than RS as an output separator when encoded. Consequently, this internal type is referred to by the generic term "sequence" in this answer, with "stream" being reserved for the encoded input and output.
The first object's keys can be extracted with:
.[0] | keys_unsorted
Keys will generally be kept in their original order, but preserving the exact order isn't guaranteed. Consequently, they will need to be used to index the objects to get the values in the same order. This will also prevent values being in the wrong columns if some objects have a different key order.
To both output the keys as the first row and make them available for indexing, they're stored in a variable. The next stage of the pipeline then references this variable and uses the comma operator to prepend the header to the output stream.
(.[0] | keys_unsorted) as $keys | $keys, ...
The expression after the comma is a little involved. The index operator on an object can take a sequence of strings (e.g. "name", "value"), returning a sequence of property values for those strings. $keys is an array, not a sequence, so [] is applied to convert it to a sequence,
$keys[]
which can then be passed to .[]
.[ $keys[] ]
This, too, produces a sequence, so the array constructor is used to convert it to an array.
[.[ $keys[] ]]
This expression is to be applied to a single object. map() is used to apply it to all objects in the outer array:
map([.[ $keys[] ]])
Lastly for this stage, this is converted to a sequence so each item becomes a separate row in the output.
map([.[ $keys[] ]])[]
Why bundle the sequence into an array within the map only to unbundle it outside? map produces an array; .[ $keys[] ] produces a sequence. Applying map to the sequence from .[ $keys[] ] would produce an array of sequences of values, but since sequences aren't a JSON type, you instead get a flattened array containing all the values:
["NSW","AU","state","New South Wales","AB","CA","province","Alberta","ABD","GB","council area","Aberdeenshire","AK","US","state","Alaska"]
The values from each object need to be kept separate, so that they become separate rows in the final output.
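The difference is easy to see on a small input. The following sketch uses two hypothetical objects (the sample data and variable names are assumptions) and compares map with and without the inner array constructor:

```shell
# Hypothetical sample input with two keys per object.
input='[{"code":"NSW","country":"AU"},{"code":"AB","country":"CA"}]'

# Without the inner [...]: map collects every emitted value into one flat array.
flat=$(printf '%s' "$input" | jq -c '(.[0] | keys_unsorted) as $keys | map(.[ $keys[] ])')

# With the inner [...]: one array per object, which later becomes one CSV row each.
rows=$(printf '%s' "$input" | jq -c '(.[0] | keys_unsorted) as $keys | map([.[ $keys[] ]])')

printf '%s\n%s\n' "$flat" "$rows"
# ["NSW","AU","AB","CA"]
# [["NSW","AU"],["AB","CA"]]
```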
Finally, the sequence is passed through the @csv formatter.
The items can be separated late, rather than early. Instead of using the comma operator to get a sequence (passing a sequence as the right operand), the header ($keys) can be wrapped in an array, and + used to append the array of value rows. This still needs to be converted to a sequence before being passed to @csv.
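To compare the two one-liners, the sketch below runs both on a small hypothetical input shaped like the flattened array shown earlier (the sample objects and the variable names early and late are assumptions). Both variants should print the same three CSV lines:

```shell
# Hypothetical sample input; key order is code, country, level, name.
input='[{"code":"NSW","country":"AU","level":"state","name":"New South Wales"},
        {"code":"AB","country":"CA","level":"province","name":"Alberta"}]'

# Variant 1: split into a sequence early, using the comma operator.
early=$(printf '%s' "$input" | jq -r '
  (.[0] | keys_unsorted) as $keys | $keys, map([.[ $keys[] ]])[] | @csv')

# Variant 2: build one array of rows first, then split it with [].
late=$(printf '%s' "$input" | jq -r '
  (.[0] | keys_unsorted) as $keys | ([$keys] + map([.[ $keys[] ]])) [] | @csv')

printf '%s\n' "$early"
# "code","country","level","name"
# "NSW","AU","state","New South Wales"
# "AB","CA","province","Alberta"
```

Note that both variants take the columns from the first object only, so keys that appear only in later objects are dropped.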
I created a function that outputs an array of objects or arrays to csv with headers. The columns would be in the order of the headers.
def to_csv($headers):
  def _object_to_csv:
    ($headers | @csv),
    (.[] | [.[$headers[]]] | @csv);
  def _array_to_csv:
    ($headers | @csv),
    (.[][:$headers|length] | @csv);
  if .[0]|type == "object"
    then _object_to_csv
    else _array_to_csv
  end;
So you could use it like so:
to_csv([ "code", "name", "level", "country" ])
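For a complete invocation, the function definition and the call can go into a single jq program. This is only a sketch: the input objects are hypothetical, and the shell variable names are assumptions. It exercises the object branch and shows that the columns follow the header order rather than the key order of the input:

```shell
# Hypothetical sample input; note the key order differs from the header order.
input='[{"code":"NSW","country":"AU","level":"state","name":"New South Wales"},
        {"code":"AB","country":"CA","level":"province","name":"Alberta"}]'

result=$(printf '%s' "$input" | jq -r '
  def to_csv($headers):
    def _object_to_csv:
      ($headers | @csv),
      (.[] | [.[$headers[]]] | @csv);
    def _array_to_csv:
      ($headers | @csv),
      (.[][:$headers|length] | @csv);
    if .[0]|type == "object"
      then _object_to_csv
      else _array_to_csv
    end;
  to_csv(["code", "name", "level", "country"])
')

printf '%s\n' "$result"
# "code","name","level","country"
# "NSW","New South Wales","state","AU"
# "AB","Alberta","province","CA"
```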
The following filter is slightly different in that it will ensure every value is converted to a string. (jq 1.5+)
# For an array of many objects
jq -f filter.jq [file]
# For many objects (not within array)
jq -s -f filter.jq [file]
Filter: filter.jq
def tocsv:
  (map(keys)
    |add
    |unique
    |sort
  ) as $cols
  |map(. as $row
    |$cols
    |map($row[.]|tostring)
  ) as $rows
  |$cols,$rows[]
  | @csv;
tocsv
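The effect of tostring is easiest to see with values that @csv would otherwise reject or render as empty. In this sketch the input is hypothetical and the filter is inlined instead of being read from filter.jq (both are assumptions made to keep the example self-contained). Numbers become quoted strings, missing or null values become the string "null", and the nested array is JSON-encoded instead of causing an error:

```shell
# Hypothetical sample input with a nested array and a null value.
input='[{"id": 1, "tags": ["x", "y"]}, {"id": 2, "note": null}]'

result=$(printf '%s' "$input" | jq -r '
  def tocsv:
    (map(keys) | add | unique) as $cols
    | map(. as $row | $cols | map($row[.] | tostring)) as $rows
    | $cols, $rows[]
    | @csv;
  tocsv
')

printf '%s\n' "$result"
# "id","note","tags"
# "1","null","[""x"",""y""]"
# "2","null","null"
```

Whether "null" is preferable to an empty field depends on the consumer of the CSV; the filters without tostring leave missing values empty.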
This variant of Santiago's program is also safe but ensures that the key names in the first object are used as the first column headers, in the same order as they appear in that object:
def tocsv:
  if length == 0 then empty
  else
    (.[0] | keys_unsorted) as $keys
    | (map(keys) | add | unique) as $allkeys
    | ($keys + ($allkeys - $keys)) as $cols
    | ($cols, (.[] as $row | $cols | map($row[.])))
    | @csv
  end ;
tocsv
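A short run makes the column ordering concrete. The sketch below uses a hypothetical input in which the second object has an extra key (the data and variable names are assumptions). The first object's keys lead in their original order, and the remaining keys follow:

```shell
# Hypothetical sample input: the second object adds the key "extra".
input='[{"name": "b", "id": 1}, {"id": 2, "extra": true, "name": "c"}]'

result=$(printf '%s' "$input" | jq -r '
  def tocsv:
    if length == 0 then empty
    else
      (.[0] | keys_unsorted) as $keys
      | (map(keys) | add | unique) as $allkeys
      | ($keys + ($allkeys - $keys)) as $cols
      | ($cols, (.[] as $row | $cols | map($row[.])))
      | @csv
    end ;
  tocsv
')

printf '%s\n' "$result"
# "name","id","extra"
# "b",1,
# "c",2,true
```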