I'm slowly getting a grip of how jq works but am still far from mastering it. Now I'm in a situation I've kinda managed to get what I wanted but not to display it the way I want. I'm sure it's quite simple but I'm missing it...
Here's a sample of the JSON I want to parse:
{
"sites": [
{
"site_id": 123456,
"status": "configured",
"domain": "www.domain.com",
"account_id": 654321,
"security": {
"waf": {
"rules": [
{
"action": "block_request",
"action_text": "Block",
"id": "sqli",
"name": "SQLi"
},
{
"action": "block_request",
"action_text": "Block",
"id": "xss",
"name": "XSS"
},
{
"action": "alert",
"action_text": "Alert",
"id": "path_vector",
"name": "Path Vector"
}
]
}
}
}
],
"res": 0,
"res_message": "OK",
"debug_info": {
"id-info": "9123"
}
}
I only need a few details and put them in CSV format, here's what I did so far:
cat test.json | jq -r '.sites [] | [.site_id,.domain],(.security.waf.rules[] | [.action_text]) | @csv'
This is the ouput I got:
123456,"www.domain.com"
"Block"
"Block"
"Alert"
Not so bad, but what I'm looking for is something like this:
123456,"www.domain.com","Block","Block","Alert"
Same result, just displayed in one single line. I went over the man pages and fiddled around for a while to no avail. Is it possible to do it or I need a different tool to manipulate it?
Thanks in advance!
First lets discuss why you received that result.
When you use []
to extract items from objects/arrays, it yields a value for every item in that object/array.
.sites[]
produces a result for the every value in your sites
array (which in this case there's only one).
Another thing to note, using a comma (,
) will yield the delimited values within that expression.
[.site_id,.domain]
The comma here produces two values, the site_id
and the domain
. But, those values are collected in to an array (as denoted by the square brackets).
Putting this into the next part of the expression
.security.waf.rules[] | [.action_text]
The first part goes through all the rules objects in that array. Then for each of those objects, creates an array containing the action_text
. This creates three arrays (one for each of the rules).
Put this together with the previous part of the expression (slightly reformatted)
([.site_id,.domain]) , (.security.waf.rules[] | [.action_text])
This all together produces four arrays, the array containing the site_id
and domain
, followed by the three arrays of action_text
.
Then for each of those four arrays, a csv row is created giving you the results you see.
So how can we get the desired results?
First, we'll want to start go through all the sites. I'm assuming you want a row per site.
.sites[]
Then for each site, we need to build an array of the values in that row. Start with what we have direct access to.
.site_id, .domain
Then produce the action_text
values.
.security.waf.rules[].action_text
Note we're not putting the action_text
in a separate array, we just want the value.
Now we put those values together.
.site_id, .domain, (.security.waf.rules[].action_text)
This creates five values as we discussed, but we want to collect them in an array so we may pass it to the @csv
filter.
[.site_id, .domain, (.security.waf.rules[].action_text)]
Putting everything together will give us this filter:
.sites[] | [.site_id, .domain, (.security.waf.rules[].action_text)] | @csv
Of course there's many approaches you can take to get these values (like building out arrays separately then combining them) but this is the most direct.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With