Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Yet struggling to parse JSON with jq

I'm slowly getting a grip of how jq works but am still far from mastering it. Now I'm in a situation I've kinda managed to get what I wanted but not to display it the way I want. I'm sure it's quite simple but I'm missing it...

Here's a sample of the JSON I want to parse:

{
    "sites": [
        {
            "site_id": 123456,
            "status": "configured",
            "domain": "www.domain.com",
            "account_id": 654321,
            "security": {
                "waf": {
                    "rules": [
                        {
                            "action": "block_request",
                            "action_text": "Block",
                            "id": "sqli",
                            "name": "SQLi"
                        },
                        {
                            "action": "block_request",
                            "action_text": "Block",
                            "id": "xss",
                            "name": "XSS"
                        },
                        {
                            "action": "alert",
                            "action_text": "Alert",
                            "id": "path_vector",
                            "name": "Path Vector"
                        }
                    ]
                }
            }
        }
    ],
    "res": 0,
    "res_message": "OK",
    "debug_info": {
        "id-info": "9123"
    }
}

I only need a few details and put them in CSV format, here's what I did so far:

cat test.json | jq -r '.sites [] | [.site_id,.domain],(.security.waf.rules[] | [.action_text]) | @csv'

This is the ouput I got:

123456,"www.domain.com"
"Block"
"Block"
"Alert"

Not so bad, but what I'm looking for is something like this:

123456,"www.domain.com","Block","Block","Alert"

Same result, just displayed in one single line. I went over the man pages and fiddled around for a while to no avail. Is it possible to do it or I need a different tool to manipulate it?

Thanks in advance!

like image 253
TuxSax Avatar asked Feb 04 '23 07:02

TuxSax


1 Answers

First lets discuss why you received that result.

When you use [] to extract items from objects/arrays, it yields a value for every item in that object/array.

.sites[]

produces a result for the every value in your sites array (which in this case there's only one).

Another thing to note, using a comma (,) will yield the delimited values within that expression.

[.site_id,.domain]

The comma here produces two values, the site_id and the domain. But, those values are collected in to an array (as denoted by the square brackets).

Putting this into the next part of the expression

.security.waf.rules[] | [.action_text]

The first part goes through all the rules objects in that array. Then for each of those objects, creates an array containing the action_text. This creates three arrays (one for each of the rules).

Put this together with the previous part of the expression (slightly reformatted)

([.site_id,.domain]) , (.security.waf.rules[] | [.action_text])

This all together produces four arrays, the array containing the site_id and domain, followed by the three arrays of action_text.

Then for each of those four arrays, a csv row is created giving you the results you see.


So how can we get the desired results?

First, we'll want to start go through all the sites. I'm assuming you want a row per site.

.sites[]

Then for each site, we need to build an array of the values in that row. Start with what we have direct access to.

.site_id, .domain

Then produce the action_text values.

.security.waf.rules[].action_text

Note we're not putting the action_text in a separate array, we just want the value.

Now we put those values together.

.site_id, .domain, (.security.waf.rules[].action_text)

This creates five values as we discussed, but we want to collect them in an array so we may pass it to the @csv filter.

[.site_id, .domain, (.security.waf.rules[].action_text)]

Putting everything together will give us this filter:

.sites[] | [.site_id, .domain, (.security.waf.rules[].action_text)] | @csv

Of course there's many approaches you can take to get these values (like building out arrays separately then combining them) but this is the most direct.

like image 141
Jeff Mercado Avatar answered Feb 11 '23 11:02

Jeff Mercado