I am using jq to parse log data. Occasionally the logs contain malformed entries (invalid JSON), and when this happens, jq aborts processing at that point.
Is there a way to have jq keep processing what it can, while reporting the problems via stderr?
I understand that if your JSON contains newlines, jq may have trouble when it picks up again at the next line, but in such cases it will still eventually reach the start of a legitimate JSON message and can continue processing.
With jq-1.5 I was able to do the following:
With this example file:
cat << EOF > example.log
{"a": 1}
{invalid
{"b": 2}
EOF
Output non-JSON lines as unquoted strings:
cat example.log | jq --raw-input --raw-output '. as $raw | try fromjson catch $raw'
{
"a": 1
}
{invalid
{
"b": 2
}
Silently skip non-JSON lines:
cat example.log | jq --raw-input 'fromjson?'
{
"a": 1
}
{
"b": 2
}
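To also report the problems via stderr, as the question asks, you can route the offending line through the stderr builtin (available since jq 1.5) and then drop it from stdout with empty. A minimal sketch; note that depending on your jq version, stderr may print the bad line as a quoted JSON string:

cat example.log | jq --raw-input '. as $raw | try fromjson catch ($raw | stderr | empty)'

With the example file above, the two valid objects go to stdout while the {invalid line is written to stderr.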
You can add --slurp if the entire input is expected to be a single multiline JSON blob.
Example files:
cat << EOF > valid-multiline.log
{
"a": 1,
"b": 2
}
EOF
cat << EOF > invalid-multiline.log
{
asdf
"b": 2
}
EOF
Outputs:
cat valid-multiline.log | jq --slurp --raw-input --raw-output '. as $raw | try fromjson catch $raw'
{
"a": 1,
"b": 2
}
cat invalid-multiline.log | jq --slurp --raw-input --raw-output '. as $raw | try fromjson catch $raw'
{
asdf
"b": 2
}
If you have jq 1.5, the answer is: yes, though in general, preprocessing (e.g. using hjson or any-json) would be preferable.
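For the preprocessing route, the idea is to convert the lenient input into strict JSON before jq ever sees it. A hypothetical sketch using the hjson command-line tool (the -j flag, which emits standard JSON, and the input.hjson filename are assumptions here, so check your installation's usage):

# convert lenient Hjson to strict JSON, then filter with jq as usual
hjson -j input.hjson | jq '.'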
Anyway, the idea is simply to take advantage of the try/catch feature. Here is an illustration using the inputs filter. Note that jq should in general be invoked with the -n option for this to work, since inputs reads the input stream itself and -n prevents jq from implicitly consuming (and potentially choking on) the first entity.
recover.jq:
# handle reads the next entity from the input stream and reports its length
def handle: inputs | [., "length is \(length)"] ;
# on a parse error, emit "Failed" and recurse so the remaining entities are still processed
def process: try handle catch ("Failed", process) ;
process
bad.json:
[1,2,3]
{id=546456, userId=345345}
[4,5,6]
$ jq -n -f recover.jq bad.json
[
"[1,2,3]",
"length is 3"
]
"Failed"
[
"[4,5,6]",
"length is 3"
]