
jq: Object cannot be csv-formatted, only array

I am new to jq and I have a JSON file from a DynamoDB table which I want to convert to CSV. This is my JSON file.

[
    {
        "SnsPublishTime": {
            "S": "2019-07-27T15:07:38.904Z"
        },
        "SESreportingMTA": {
            "S": "dsn; a8-19.smtp-out.amazonses.com"
        },
        "SESMessageType": {
            "S": "Bounce"
        },
        "SESDestinationAddress": {
            "S": "[email protected]"
        },
        "SESMessageId": {
            "S": "0100016c33f91857-600a8e44-c419-4a02-bfd6-7f6908f5969e-000000"
        },
        "SESbounceSummary": {
            "S": "[{\"emailAddress\":\"[email protected]\",\"action\":\"failed\",\"status\":\"5.1.1\",\"diagnosticCode\":\"smtp; 550 5.1.1 user unknown\"}]"
        }
    }
]

I get the correct output if I run

jq -r '.[] ' test.json

but if I run

jq -r '.[] |@csv' test.json

Then I am getting an error:

jq: error (at test.json:22): object ({"SnsPublis...) cannot be csv-formatted, only array

How can I convert this JSON to a CSV properly? I tried googling for over an hour and can't seem to be able to figure it out.

Thank you!

Kliment asked Jul 28 '19 at 15:07



1 Answer

For the record, here is a generic converter that turns any array of JSON objects into CSV (with headers). There are no restrictions on these objects, but the transformation is not always invertible, and the output cells might include stringified compound entities; see "Caveats".

json2csv

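# json2headers: emit a stream of paths, one per scalar value (keyed by a
# string) and one per flat array, i.e. an array containing only scalars.
def json2headers:
  def isscalar: type | . != "array" and . != "object";
  def isflat: all(.[]; isscalar);
  paths as $p
  | getpath($p)
  | if type == "array" and isflat then $p
     elif isscalar and (($p[-1]|type) == "string") then $p
     else empty end ;

# json2array($header): emit one row for the input object, with one cell per
# path in $header; missing, invalid, or object-valued paths become null.
def json2array($header):
  def value($p):
    try getpath($p) catch null
    | if type == "object" then null else . end;
  [$header[] as $p | value($p)];

# json2csv: given an array of objects, emit the header row followed by one
# CSV row per object; flat arrays are joined with "|".
def json2csv:
  ( [.[] | json2headers] | unique) as $h
  | ([$h[]|join("_") ],
     (.[]
      | json2array($h)
      | map( if type == "array" then map(tostring)|join("|") else tostring end)))
  | @csv ;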

Usage

One way to use json2csv.jq as specified above is as a jq module, e.g.

jq -r -L. 'include "json2csv"; json2csv' input.json

If the input is a stream of JSON objects:

jq -rn -L. 'include "json2csv"; [inputs]|json2csv' input.json
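
For input with exactly the shape shown in the question (a top-level array whose objects wrap every attribute as {"S": ...}), the generic module may be more than is needed. The following is only a minimal sketch under two assumptions: every attribute uses the S type, and the first object carries all the column names. The file name sample.json and its values are made up for illustration.

```shell
# Hypothetical two-attribute sample in the DynamoDB export shape from the question.
cat > sample.json <<'EOF'
[
  {"SESMessageType": {"S": "Bounce"}, "SnsPublishTime": {"S": "2019-07-27T15:07:38.904Z"}},
  {"SnsPublishTime": {"S": "2019-07-28T09:00:00.000Z"}, "SESMessageType": {"S": "Complaint"}}
]
EOF

# @csv only accepts arrays, which is why '.[] | @csv' fails on objects.
# Take the column names from the first object, then build one array per row,
# looking each column up by name so the key order of later objects is irrelevant.
flat_csv=$(jq -r '
  (.[0] | keys_unsorted) as $cols
  | $cols, (.[] | [$cols[] as $c | .[$c].S])
  | @csv' sample.json)
printf '%s\n' "$flat_csv"
```

This prints a header row ("SESMessageType","SnsPublishTime") followed by one quoted row per object. An attribute missing from a later object comes out as an empty cell, and attributes that appear only in later objects are dropped, which is one reason to prefer json2csv for irregular data.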

Caveats

  • For each object in the top-level array, the set of paths to all scalars and scalar-valued arrays is computed; if any such path is object-valued or invalid for another object, the corresponding cell in the output for that object will be "null".

  • Flat arrays are converted to pipe-separated values, so an input array such as ["1|2", "3|4"] will be indistinguishable from the string value "1|2|3|4". If this is a problem, the character used to separate array items can of course be changed.

  • Similar collisions can occur amongst the headers.
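
Both kinds of collision can be reproduced without the module. The sketch below applies the same cell-flattening expression that json2csv uses, and mimics the header naming with paths(scalars), which is a simplification of json2headers chosen for brevity. The inputs are invented for illustration.

```shell
# Cell collision: a flat array and a plain string flatten to the same cell.
# The map(...) expression is the cell-flattening step from json2csv.
cells=$(echo '[["1|2", "3|4"], "1|2|3|4"]' |
  jq -c 'map(if type == "array" then map(tostring) | join("|") else tostring end)')
printf '%s\n' "$cells"

# Header collision: the top-level key "a_b" and the nested path a.b
# both become "a_b" once path components are joined with "_".
headers=$(echo '{"a_b": 1, "a": {"b": 2}}' |
  jq -c '[paths(scalars) | join("_")]')
printf '%s\n' "$headers"
```

Both cells come out as "1|2|3|4" and both headers as "a_b". Because json2csv passes the header paths through unique before joining them, two distinct paths with the same joined name would show up as two columns carrying an identical header.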

Conversion to TSV

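sed 's/@csv/@tsv/' json2csv.jq > json2tsv.jq

Note that this only swaps the formatter: the function defined in json2tsv.jq is still named json2csv, so include "json2tsv" and call json2csv as before.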
peak answered Oct 14 '22 at 03:10
